perm filename IMP.BO[11,DOC]1 blob
sn#135859 filedate 1974-12-12 generic text, type T, neo UTF8
PDP-10 IMP72 REFERENCE MANUAL
August 1973
IMP72 Version 1.5
IMP72 REFERENCE MANUAL PAGE 2
CONTENTS
1: Introduction.
1.1: Description.
1.2: Status.
1.3: How to Use IMP in One Easy Lesson.
1.4: Differences between IMP72 and the Previous PDP-10 IMP.
1.5: IMP72 on TENEX.
2: Programmer's Guide to IMP72
2.1: Conventions.
2.1.1: Lexical.
2.1.2: Expressions and Statements.
2.1.3: Machine Language Level Programming.
2.1.4: Program Structure and Scope of Variables.
2.1.5: Compiler Version Number Conventions.
2.2: The Expressions
2.2.1: Variables and Constants
2.2.2: Unary Operators and Functions.
2.2.3: Binary Operators.
2.2.4: Control Expressions.
2.2.5: Programs, Subprograms and Subprogram Calls.
2.2.6: Byte Access.
2.2.7: Input/Output.
2.2.8: Declarations; DATA and REMOTE statements.
2.2.9: Miscellaneous Constructs.
2.3: Syntactic and Semantic Extension.
2.3.1: The Easy Way - Syntactic "Macros"
2.3.2: Specifying Syntax.
2.3.3: Specifying Semantics.
2.3.3.1: Semantic Routines.
2.3.3.1.1: Calling Conventions for Semantic Routines.
2.3.3.1.2: Table of Semantic Routines
2.3.3.2: Conditional Semantics.
2.3.3.3: CASEs.
2.3.3.4: The VALUE Kludge.
2.3.3.5: Priority Semantics.
2.3.4: Syntactic Ambiguity and How to Make it Work for You.
2.3.5: Peaceful Co-Existence with Your Extensible Compiler.
3: How to Compile and Run IMP72 Programs.
3.1: Compiling Programs.
3.1.1: The Compiler Listing.
3.2: Compilation Error Diagnostics.
3.3: Loading and Running.
3.4: Making a New Compiler.
4: Internal Documentation of the IMP72 Compiler.
4.1: Parsing.
4.2: Semantics.
4.3: Semantics and Code Generation.
4.4: How Extensibility is Implemented.
5: Distribution Procedure for IMP72.
Appendix I: Library Utility Routines
Appendix II: Syntax of IMP72.
References.
The IMP72 compiler was designed and implemented at the Yale
University Department of Computer Science by Walt
Bilofsky(*), Institute for Defense Analyses, Princeton,
N.J., based largely on previous work of E. T. Irons.
Substantial contributions were made by Steven Weingart and
Terry Lyons. This manual reflects version 1.5 of the IMP72
compiler, which was written to run under version 5.06 of the
DECsystem-10 operating system, and which has been
successfully run under version 1.31.23 of TENEX.
------------
(*) Present address: Bolt Beranek and Newman Inc., 50
Moulton Street, Cambridge, Mass. 02138.
imp (vt): ... (archaic) to eke out, strengthen.
imp (n): ... (archaic) an evil creature.
[Webster's 1967]
"Things used as language are inexhaustibly attractive."
[Emerson 1849]
" ... The red plague rid you,
For learning me your language!"
[Shakespeare 1612]
1: Introduction.
1.1: Description.
IMP is a simple higher-level language intended
primarily for system programming. It has been implemented
on the PDP-10 and CDC 1604 and 6600 computers [Irons 1970],
[Bilofsky 1972]. It is meant to provide language facilities
roughly at the level of FORTRAN II, yet allow the programmer
the flexibility of machine language programming, including
use of all the machine's registers and instructions and
arbitrary control of the program and data areas while the
program is running. IMP72 is a version of IMP for the
PDP-10 which provides the user with the following
facilities:
1. Extensibility. The user may specify extensions
to the syntax and semantics of the language in
forms ranging from simple "macros" to productions
which generate calls to compiler code-generating
routines. More efficient object code may easily
be specified for special cases.
2. Floating point capabilities. A REAL data type
and floating point arithmetic are provided.
3. Byte manipulation capability.
4. No reserved words in the syntax.
5. Syntactic error correction and admissibility of
ambiguous syntax.
1.2: Status.
As of June 1973, the compiler has been in use at the
Yale Department of Computer Science for a full academic
year. It compiles itself. A version of the compiler which
generates object code for the PDP-11 computer exists at
Yale. The PDP-10 compiler is relatively large (37K minimum,
with up to about 50K needed to compile itself) and not
particularly fast (compiles at the rate of about 60 input
tokens per second), but it is felt that this is not
unreasonable for a compiler of the generality of IMP.
The IMP72 compiler is provided (courtesy of the Yale
Department of Computer Science) and maintained (as of this
writing) by the author, on a purely informal basis.
Although the author feels that the IMP72 compiler is
practically bug-free, and intends to maintain and update it,
the user is reminded that this does not constitute a
guarantee that it is or he will. Further details on where
to send complaints may be found in Section 5.
1.3: How to Use IMP in One Easy Lesson
This section is intended to give the experienced
programmer the flavor of IMP so that he may write simple
programs without having to read the entire manual.
IMP is similar to FORTRAN II in power, and a bit like
ALGOL in flavor. It has no block structure or reserved
words, and all variables are global to all programs in a
compilation. Statements may be grouped, using parentheses.
Variables need not be declared unless they are arrays.
Arrays are declared by, e.g., FOO IS 50 LONG and subscripts
written, e.g., FOO[I+3]. The assignment operator is "←" and
the statement separator is ";". All binary arithmetic
operators are of equal precedence, and operations are
performed from right to left (as in APL). "←" is of the
same precedence as the arithmetic operators and also is
performed from right to left. Relational operators are
evaluated after arithmetic operators.
Statements may be labeled by prefixing them with
LABEL:. The transfer of control is GO TO LABEL and the
conditional is A=>B or A=>B ELSE C, where B and C are
statements or groups of statements (in parentheses), and A
is an expression (0 is false, non-zero is true), perhaps
containing a relational operator (EQ, NE, GE, GT, LE, LT).
The basic iteration construct is A FOR B IN C,D,E, where A
is a group of statements (in parentheses), B is a variable,
and C, D and E are expressions. A is executed for values of
B from C, incremented by D, to E.
An IMP source file consists of a number of statements,
in free format. The file is terminated by "%%". The last
statement executed should be FINI(0), which exits from the
program. To compile the IMP program on file FOO.IMP, call
the compiler by the monitor command RUN IMP (on TENEX, just
IMP), and when it types an asterisk, type FOO.IMP<CR>. To
execute the resulting object program on DECsystem-10, type
the command EX FOO.REL,/LIBRARY,IMPLIB.LIB. On TENEX,
execute by calling the loader subsystem with the command
LOADER, and then type FOO.REL,/LIMPLIB.LIB<ALTMODE>.
1.4: Differences between IMP72 and the Previous PDP-10 IMP.
IMP72 contains all features of old PDP-10 IMP with the
following exceptions:
1. Trace is not implemented.
2. Relational operators and A=>B ELSE C are
implicitly parenthesized differently in a few
cases. See sections 2.2 and 2.2.4.
3. The formal syntax of the language has other minor
changes which should not affect the compilation of
old IMP programs in IMP72. For example, A←B is
now permissible as a parameter of a subroutine
call. (The parameter passed is B.)
1.5: IMP72 on TENEX.
IMP72 was written to run under the DECsystem-10
operating system. Since TENEX supports most features of
DECsystem-10, it will be possible for TENEX users to make
use of IMP. Certain TENEX features, such as referring to
directories by name instead of project,programmer numbers,
will not be available.
2: Programmer's Guide to IMP72
2.1: Conventions.
2.1.1: Lexical.
The language is basically free-form, subject to the
following constraints. Words representing variable names or
special words of the language are delimited by
non-alphanumeric characters, including space. Therefore, spaces
may not appear inside words or numbers, and at least one
delimiter must appear between two such words. All
characters having an ASCII character code of 40B or less are
interpreted as blanks, except when enclosed in !, # or '
signs. Thus, string constants may contain any character,
including tabs and returns, and returns and line feeds
outside !, # and ' signs serve as delimiters. As a general
rule, make the text readable and you will find the words
properly distinguished.
Identifiers usually consist of alphanumeric strings,
but an identifier containing any special character (except
!) may be represented by enclosing the entire name in !
signs; e.g., !Funny*id.! (see Section 2.2.1).
All characters appearing between # signs are treated as
comments and ignored by the compiler (except for # signs in
string constants and between ! signs).
2.1.2: Expressions and Statements.
In IMP, no distinction is made between expressions and
statements. Almost everything is an expression, and has a
value, although a few expressions like GO TO X do not have a
very useful value. Two consequences of this philosophy are
that parentheses may be used for grouping what are usually
called statements in other languages, and that the statement
separator ';' becomes a binary operator (whose value is the
second operand).
The terms "statement" and "expression" will be used
almost interchangeably herein. When the term "expression"
is used, it should be understood that we may be concerned
with the value of the expression, whereas when the term
"statement" appears, we are interested primarily in the
effect of executing it, and only secondarily, if at all, in
its value.
2.1.3: Machine Language Level Programming.
The identifiers 0R,1R,...,15R refer to the 16 registers
of the PDP-10: 0,1,...,15. Therefore you may write down
expressions which refer to particular registers. If you
write down an expression which can be evaluated with a
single instruction like '1R←1R+M', that instruction will
almost always be used to implement the expression.
Instructions which cannot be expressed in this way may be
implemented by user-provided syntax.
Statement labels are used exactly like variable names,
so that one can reference and change instructions in the
program. Thus 'T←200000000000B; T:GO TO T' puts the octal
instruction '200000000000' into the next word and executes
it. (We happen to know that GO TO T takes just one
instruction, and so use it to set aside a word to store in).
2.1.4: Program Structure and Scope of Variables.
Program structure is more or less like that of the
PDP-10 FORTRAN system. Thus a main program, if any, comes
first, then subroutines, if any, follow.
In contrast to the FORTRAN convention, all variables in
IMP are global to the entire set of subroutines compiled at
the same time, except that formal arguments of a subroutine
are local within that subroutine. That is, any two
references to the same variable name within a single IMP
compilation will refer to the same memory location, even if
the references are within different subroutines. However,
references to formal arguments of subroutines will have the
effect of references to the memory location containing the
actual argument in the call of the subroutine. Thus
references to formal subroutine arguments will not affect
variables of the same name in other subprograms in the same
compilation.
The declaration LOCAL is provided to override the rule
that variables are normally global to the entire module. An
identifier which is declared to be LOCAL will be replaced
throughout the subroutine in which it is so declared by a
unique name, which will not be used outside the subroutine.
The identifier will therefore appear local to the
subroutine.
IMP uses the FORTRAN subroutine calling conventions
(see sect. 2.2.5). Thus, IMP and FORTRAN programs may call
each other at will. However, the IMP and FORTRAN I/O
library subroutines have been known to conflict at times.
It is therefore advisable not to perform formatted I/O from
both IMP and FORTRAN subroutines in the same program.
2.1.5: Compiler Version Number Conventions.
The compiler version number and creation date appear on
its prompt and on the listings it produces. Version 1.5 is
release 5 of major version 1. A new release may differ from
the previous one by having language additions and/or fewer
bugs. A major version contains substantial changes. An
extension letter (e.g., 1.5(A)) appears in user-created
extensions of the compiler (see Section 3.1 for the method
of creating an extended compiler version using the C, U and
V switches). The date is the creation date of the current
copy of the version being run.
2.2: The Expressions.
The expressions of the language are listed here.
Expressions may be grouped explicitly by parentheses. Where
parentheses are omitted, the following is the order in which
subexpressions are evaluated:
1. Unary operators and functions.
2. Binary operators except ";" (but including
"←").
3. Relational operators.
4. Conditionals.
5. ";".
When two operators of equal precedence appear
consecutively in an expression, the rightmost is performed
first (as in APL). E.g., A*B+C is evaluated as A*(B+C).
EXCEPTION: the order of evaluation of relational operators
on the same level - e.g., A<B GE C - is not defined.
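The right-to-left rule for operators of equal precedence can be illustrated outside IMP. The following is a minimal sketch in Python, not IMP; the function name eval_right_to_left and the token-list representation are assumptions made for illustration, and only +, - and * are modelled.

```python
import operator

# Binary operators modelled here; all share a single precedence level.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_right_to_left(tokens):
    """Evaluate [operand, op, operand, ...] performing the rightmost operator first."""
    value = tokens[-1]
    # Walk leftwards two tokens at a time: an operator, then its left operand.
    for i in range(len(tokens) - 2, 0, -2):
        value = OPS[tokens[i]](tokens[i - 1], value)
    return value

# A*B+C groups as A*(B+C): 2*(3+4) is 14, not (2*3)+4.
print(eval_right_to_left([2, "*", 3, "+", 4]))
# 10-4-3 groups as 10-(4-3), which is 9.
print(eval_right_to_left([10, "-", 4, "-", 3]))
```

The second call shows why the rule matters for non-associative operators: a left-to-right reading of 10-4-3 would give 3.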
In "A; B", A is evaluated before B. In a conditional,
the implicand will be evaluated after the condition is
evaluated, if at all. Other than the restrictions listed
above, IMP places no restrictions on the order of evaluation
of arguments of operators.
In the following table, if no value is specified for an
expression, then the value of that expression is not defined
and/or not useful.
2.2.1: Variables and Constants
SAMPLE EXPRESSION MEANING
19576
Integer numbers (base 10).
777000B
Digits followed by 'B': octal constants
2AFB16
A string of digits and letters beginning with
a digit and ending in 'Bnn', where nn is a
decimal number. These are constants in base
nn (called "flexadecimal" constants). The
letters and digits to the left of the last B
are interpreted as a constant in base nn.
E.g., 2ABB16 is the base 16 constant 2AB (683
decimal). 10100B2 is the binary constant
10100, or 20 decimal. The base may be
arbitrarily large, but only digits and the
letters A-Z may be used to represent digits.
(A-Z are the digits 10-35.) Note that a
constant starting with a letter is not legal.
For example, ABB12 is a variable name; the
desired constant may be written 0ABB12.
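The rules for decimal, octal and "flexadecimal" constants can be sketched as a small parser. This is an illustrative Python model, not the compiler's own algorithm; the name parse_integer_constant is hypothetical, and rejecting a digit that is not smaller than the base is an assumption the text does not state.

```python
import string

# '0'-'9' are the digits 0-9 and 'A'-'Z' are the digits 10-35.
DIGITS = string.digits + string.ascii_uppercase

def is_decimal(text):
    """True for a non-empty run of the ASCII digits 0-9."""
    return text != "" and all(ch in string.digits for ch in text)

def parse_integer_constant(text):
    """Return the value of an integer constant, or raise ValueError."""
    if not text or text[0] not in string.digits:
        raise ValueError(f"{text!r} does not start with a digit")
    if is_decimal(text):
        return int(text)                  # plain base 10, e.g. 19576
    # Everything left of the last 'B' is the digit string.
    body, marker, base_text = text.rpartition("B")
    if marker == "":
        raise ValueError(f"{text!r} is not an integer constant")
    if base_text == "":
        base = 8                          # trailing B alone: octal
    elif is_decimal(base_text):
        base = int(base_text)             # Bnn: base nn, nn written in decimal
    else:
        raise ValueError(f"{text!r} is not an integer constant")
    value = 0
    for ch in body:
        digit = DIGITS.find(ch)
        # Assumption: each digit must be smaller than the base.
        if digit < 0 or digit >= base:
            raise ValueError(f"{ch!r} is not a base-{base} digit")
        value = value * base + digit
    return value

for sample in ("19576", "777000B", "2ABB16", "10100B2", "0ABB12"):
    print(sample, "=", parse_integer_constant(sample))
```

Passing ABB12 raises an error in this sketch, mirroring the rule that a name starting with a letter is an identifier rather than a constant.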
3.27"-5
Floating point constant. Consists of a
string of digits containing an embedded
decimal point (i.e., .01 and 3. are not
legal), and followed optionally by " and a
signed power-of-ten scaling factor.
A
Variable name. Any string of letters and
digits not interpretable as one of the above
forms of constants.
9R
Numbers 0 through 15 followed by 'R' name the
corresponding PDP-10 registers.
'ABCdef'
ASCII strings: stored left justified, zero
filled and spilling into successive words.
An ASCII string always terminates with at
least one zero character. The character ' is
represented within a string by two successive
quotes: ''. Thus, the string consisting of
one single quote would be written '''' (but
would print as '). The string '' is the same
as 0.
R'Stg'
A 1 to 5 character ASCII string prefixed by
the letter R is a constant containing the
ASCII characters stored right-justified. (R
may also prefix a string constant written as
!'Stg! - see following paragraph.)
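The two storage layouts for string constants can be modelled as follows. This Python sketch assumes the usual PDP-10 packing of five 7-bit characters per 36-bit word with the low-order bit unused; the function names are hypothetical, and the exact bit positions used for the R form are an assumption.

```python
CHARS_PER_WORD = 5   # five 7-bit characters; the low-order bit is unused

def pack_left_justified(text):
    """Pack text left-justified, zero filled, ending with at least one zero character."""
    codes = [ord(ch) & 0o177 for ch in text] + [0]   # guarantee a terminator
    while len(codes) % CHARS_PER_WORD:
        codes.append(0)                              # zero fill the last word
    words = []
    for start in range(0, len(codes), CHARS_PER_WORD):
        word = 0
        for code in codes[start:start + CHARS_PER_WORD]:
            word = (word << 7) | code
        words.append(word << 1)                      # leave the low bit empty
    return words

def pack_right_justified(text):
    """Model R'...': 1 to 5 characters right-justified in one word."""
    if not 1 <= len(text) <= CHARS_PER_WORD:
        raise ValueError("R strings hold 1 to 5 characters")
    word = 0
    for ch in text:
        word = (word << 7) | (ord(ch) & 0o177)
    return word

print([f"{w:012o}" for w in pack_left_justified("ABCdef")])
print(f"{pack_right_justified('Stg'):012o}")
```

In this model the empty string packs to a single zero word, and a five-character string needs a second, all-zero word to hold its terminator.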
!Any thing.!
Special identifier: any string of characters
not containing a '!', bracketed by '!'s. The
string of characters is considered to be a
single identifier or symbol. In particular,
the following interpretations are made:
One special character is recognized as
itself. E.g., !+! is the same as +.
Any string starting with a single quote is
recognized as the ASCII string containing the
remainder of the string. E.g., !'Don't! is
the ASCII string constant "Don't".
Any other string is interpreted as an
identifier whose name is exactly the string
appearing between the ! signs. This allows
variable names containing special characters.
The user employing variable names containing
special characters is warned that the
compiler generates temporary labels prefixed
with a %, and syntactic class names in syntax
statements prefixed with a #. Use by the
user of identical names may produce anomalous
results.
A[E]
The E'th word of array 'A' starting from
zero. A is any variable name or constant,
and E is any expression.
[E]
The E'th word of memory: thus the contents of
the word whose address is E.
LOC(A)
The address of the memory location containing
A. A may be a simple or subscripted
variable.
2.2.2: Unary Operators and Functions.
-A
Negative of A (twos complement)
NOT A
Ones complement of A
(A)
Parentheses used to group expressions: value
is A. A may be any expression, including a
string of statements.
2.2.3: Binary Operators.
A←B
Stores the value of B in A. The value of
this expression is the value stored in A. If
A and B are of different arithmetic types,
the value of B is converted to the type of A
before storing.
A<=B
Stores the value of B in A, without
performing any type conversion if A and B are
of different arithmetic types.
A; B
; is a binary operator whose value is B. It
is used as a statement separator. It always
evaluates A first, then B. Example: the
value of (A←5; B←3) is 3.
A+B
A-B
A*B
A/B
These are the arithmetic operators. They are
either integer (if both operands are of type
integer) or rounded floating point (if either
operand is real; an integer operand is
converted to real before the operation is
performed). (Caution: when / is used
with register names as operands, in general
the divide is performed in a different
register than that holding either operand, in
order to avoid difficulties due to the divide
operation requiring two adjacent registers.
However, the expression R←R/S, where R is a
register name, is compiled to perform the
divide in register R.)
A//B
The remainder of A when divided by B. If
either A or B or both are floating point,
they are converted to fixed point before the
computation is performed. The result is
always integer.
A<B (or A LT B)
A>B (or A GT B)
A=B (or A EQ B)
A LE B
A GE B
A NE B
Relational operators. These have the value
-1 if the relation holds, otherwise 0. The
words LE, GE and NE are used for the
operators 'less than or equal to', 'greater
than or equal to' and 'not equal to'. NOTA
BENE: Relational expressions must be enclosed
in parentheses wherever they appear except as
the condition in a conditional expression.
A OR B
A AND B
A XOR B
A EQV B
Bit-by-bit logical binary operators.
A LS B
A left shifted B bits, end off, zero filled.
A RS B
A right shifted B bits, end off, zero filled.
A LROT B
A left shifted circularly B bits.
A RROT B
A right shifted circularly B bits.
A ALS B
A left shifted B bits, sign extended.
A ARS B
A right shifted B bits, sign extended. (If B
is negative in the above instructions, the
direction of shift is reversed.)
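The shift and rotate operators act on 36-bit words, which can be modelled with masked Python integers. This is a sketch under stated assumptions, not a description of the generated code: the function names are hypothetical, ARS is modelled only for non-negative counts, and ALS is left out because the text does not say how it treats the sign bit.

```python
WORD = 36
MASK = (1 << WORD) - 1
SIGN = 1 << (WORD - 1)

def ls(a, b):
    """A LS B: logical left shift; a negative count shifts right instead."""
    return rs(a, -b) if b < 0 else (a << b) & MASK

def rs(a, b):
    """A RS B: logical right shift; a negative count shifts left instead."""
    return ls(a, -b) if b < 0 else (a & MASK) >> b

def lrot(a, b):
    """A LROT B: circular left shift; a negative count rotates right."""
    b %= WORD                     # Python's % maps negative counts into 0..35
    a &= MASK
    return ((a << b) | (a >> (WORD - b))) & MASK

def rrot(a, b):
    """A RROT B: circular right shift."""
    return lrot(a, -b)

def ars(a, b):
    """A ARS B: right shift that copies the sign bit (non-negative counts only)."""
    if b < 0:
        raise ValueError("negative counts are not modelled for ARS")
    a &= MASK
    signed = a - (1 << WORD) if a & SIGN else a
    return (signed >> b) & MASK

print(f"{ls(0o777, 3):012o}", f"{rs(0o777, 3):012o}")
print(f"{lrot(SIGN, 1):012o}", f"{ars(SIGN, 35):012o}")
```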
2.2.4: Control Expressions.
A=>B
If A is not 0, evaluate B (Read as "A implies
B"). Thus, K=>X←1 does nothing if K is zero,
or sets X to 1 if K is non-zero.
X<Y=>(P←1;Q←2) changes P and Q if X is less
than Y, otherwise does nothing. NOTA BENE:
If A is a relational expression, it need not,
and for best object code should not, be
enclosed in parentheses. If A is a constant,
or an expression involving only constants,
then the code generated by the compiler for
the expression A=>B will consist either of B,
if A is non-zero, or nothing at all, if A is
zero. This allows code to be compiled
conditionally if, for example, A is an
identifier defined by syntax to be a
constant. In the case that B is not compiled
due to A being a constant 0, any tags
contained within B will not be defined, but
declarations contained in B will take effect.
A=>B ELSE C
This expression has the value B if A is
nonzero, and C if A is zero. It will
evaluate only one of its operands each time
it is executed; thus it can be used to
replace the construct "A=>(B; GO TO FOO); C;
FOO:" if the user does not mind the value
being computed and ignored (which costs a few
extra instructions of object code). NOTA
BENE: The expression A←B=>X ELSE Y, e.g.,
will be interpreted as (A←B)=>X ELSE Y. To
assign the obvious interpretation to it,
write A←(B=>X ELSE Y). If C contains another
=>-ELSE clause, then C should be enclosed in
parentheses.
A FOR B IN C,D,E
A FOR B TO E
A FOR B FROM E
These expressions perform repeated execution
of the expression A with different values of
the variable B. The IN form executes A with
values of B equal to C, C+D, C+2*D,..., E;
the TO form for B equal to 0,1,...,E, and the
FROM form for B equal to E,E-1,...,0. In
every case the expression A is evaluated at
least once regardless of the values of C, D
and E. In the IN case when D is not a
constant, B must exactly reach the value of
E. E.g., in
I←2;
(A[J]←0) FOR J IN 0,I,9
the loop will never terminate. In all other
cases, the loop terminates when B passes the
terminal value. C, D and E may be any
expressions, but in the IN and TO cases D and
E will be evaluated once per iteration of the
loop. The FROM case generates the most
efficient code and is therefore to be
preferred. If control is transferred from
within the loop, the value of the index
variable B is preserved. The value of B
following normal termination of the loop is
undefined.
The special cases (A[I]←B[I]) FOR I
TO/FROM V, where A and B are integer arrays
and V is a single variable or constant, are
compiled as a block transfer instruction.
The user is cautioned that in the FROM case,
if A and B overlap, the code produced will be
incorrect. In more general cases, block
transfers may be specified by the MOVE
construct (Section 2.2.9).
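The termination behaviour of the IN form can be illustrated with a small model. This Python sketch is an interpretation of the description above, not the compiler's code: the name for_in and the step_is_constant flag are hypothetical, the exact point at which the end test is made is an assumption, and max_iterations exists only so that the non-terminating case halts in the model.

```python
def for_in(body, start, step, end, step_is_constant=True, max_iterations=1000):
    """Model 'A FOR B IN C,D,E'; returns the values of B the body was run with."""
    seen = []
    b = start
    while len(seen) < max_iterations:      # guard so the sketch always halts
        body(b)                            # the body runs at least once
        seen.append(b)
        if step_is_constant:
            b += step
            passed = b > end if step > 0 else b < end
            if passed:                     # stop once B passes the terminal value
                break
        else:
            if b == end:                   # B must hit the terminal value exactly
                break
            b += step
    return seen

ignore = lambda b: None
print(for_in(ignore, 0, 9, 18))                              # constant step
print(for_in(ignore, 0, 2, 9))                               # constant step, stops after 8
print(len(for_in(ignore, 0, 2, 9, step_is_constant=False,    # variable step never
                 max_iterations=50)))                        # reaches 9; guard trips
```

The third call corresponds to the example with I←2: the index takes the values 0, 2, 4, ... and never equals 9, so only the guard stops the loop.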
WHILE A DO B
Repeatedly evaluates the expression A, and,
if it is nonzero (true), evaluates the
expression B. If A is zero (false), control
passes to the next statement. If A is
initially zero, B is never executed.
A UNTIL B
This expression evaluates the expression A,
then evaluates B and if B is false (zero) it
repeats. Even if B is initially true, A is
always executed at least once.
GO TO E
Transfer control to the memory location E. E
will usually be a tag, but may be a
subscripted variable. For example, GO TO TAG
is equivalent to GO TO [LOC(TAG)].
GO TO (L0,L1,...,LN) E
This expression transfers control to the
label Li, where i is the value of expression
E, and L0,...,LN are identifiers. If E<0 or
E>N, the effect is undefined.
T:A
The expression A is tagged with the label T.
Two uses of this are: GO TO T transfers
control to this point in the program, and
references to the variable T will refer to
the first instruction word in the expression
A. To insert a tag where an expression does
not immediately follow, such as before a ')',
write TAG:0.
2.2.5: Programs, Subprograms and Subprogram Calls.
A%%
An IMP program consists of an IMP expression
followed by two % signs. The % signs
terminate the input file as far as the
compiler is concerned.
SUBR A(B,C) IS D
Define a subroutine (or function) named A,
with formal arguments B and C. The value of
the subroutine is D, (unless the subroutine
is exited via a RETURN statement, q.v.). The
normal exit from the subroutine is by
"running off the end" of the expression D.
Example:
SUBR ABS(A) IS (A<0=>-A ELSE A);
is a subroutine whose value is the absolute
value of its argument. With this definition
P←ABS(Q) calls the subroutine ABS and sets P
to the absolute value of Q. The arguments of a
subroutine "share" the memory location of the
actual arguments with which the subroutine is
called; that is, changing the value of one of
the parameters in the subroutine will change
the value of the variable in the calling
program which corresponds to that argument.
The linkage generated by the subprogram call
F(A1,A2,...,An) is
JSA 16,F
JUMP A1
JUMP A2
...
JUMP An
(return)
The subprogram refers to its i-th argument by
@i-1(16), and returns via a JRA @n(16).
Consequently, if a subprogram is called with
more arguments than it expects, no harm is
done, and if it is called with fewer, it will
return to a random place following the call
(unless it blows up trying to reference a
non-existent argument first). Functions
return their value in register 0R. This
calling sequence is compatible with FORTRAN
at the current time (June 1973). In the case
of compiling pure (re-entrant) code, the same
calling sequence is used, with a two-word
entry block being relegated to the low
segment.
SUBR A() IS D
Define a subroutine (or function) named A
with no formal arguments, and with value D.
RETURN E
Return from the subroutine in which this
statement appears. The value returned by the
subroutine is E.
A(B,C,D)
Execute the function A with parameters B,C,D:
The value of this expression is the value of
the function. (Since there is no distinction
in IMP between functions and subroutines,
this expression is also used to call A as a
subroutine with the indicated parameters.)
A()
Execute the function A with no parameters.
2.2.6: Byte Access.
A<B,C>
This object is called a "byte". A is a
variable, either simple or subscripted; B and
C are any expressions. A<B,C> designates the
portion of the word A consisting of B bits,
located C bits from the right hand side of
the word. Thus, e.g., A<12,12> designates
the center 12 bits of A, A<1,35> is the sign
bit. A byte may be used as an operand, or
have a value stored in it. Example: the
program fragment
A←3;
(A<9,I+9>←3+A<9,I>) FOR I IN
0,9,18;
has the effect of setting A to 014011006003B.
The constructs A<R> and A<L> designate the
right and left half words of A. They are
equivalent to A<18,0> and A<18,18>
respectively.
BYTEP A<I,J>
This expression has the value of a PDP-10
byte pointer to the designated byte. Its
usefulness lies in the ability to write
X←BYTEP A<B,C> and do byte access by
referencing X, as explained in the next
paragraph. If I and J are not constants, the
byte pointer will be computed with their
values at the time the statement is executed
and will not be reevaluated if I and J change
value later.
<X>
X must contain a byte pointer. (This may be
accomplished by X←BYTEP A<I,J>.) Then <X>
refers to the byte designated by the byte
pointer.
<+X>
X must contain a byte pointer. When <+X> is
evaluated, it increments the byte pointer,
and its value is the byte then pointed to.
Incrementing a byte pointer to A<I,J> means
that the pointer will now point to A<I,J-I>
if J GE I, or to A[1]<I,36-I> if J<I.
Example: Suppose the memory locations
starting at STG contain an ASCII string
stored five 7-bit characters to a word, with
the rightmost bit empty. Then the following
program fragment unpacks STG into the
locations starting at UNP, one character per
word, right-justified (remember, the ASCII
string terminates with a 0 character):
X←BYTEP STG<7,36>; I←-1;
LOOP: (UNP[I←I+1]←<+X>)=>GO TO LOOP
Byte pointers may also be stored into, with
or without first being incremented: <+X>←I.
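The incrementing rule for byte pointers and the unpacking loop can be modelled together. In this Python sketch a byte pointer is represented as a tuple (word index, size, position), which is an illustrative stand-in rather than the PDP-10 byte pointer format; pack, increment and unpack are hypothetical names.

```python
def pack(text):
    """Build test data: five 7-bit characters per word, low bit empty, zero terminated."""
    codes = [ord(ch) & 0o177 for ch in text] + [0]
    while len(codes) % 5:
        codes.append(0)
    words = []
    for start in range(0, len(codes), 5):
        word = 0
        for code in codes[start:start + 5]:
            word = (word << 7) | code
        words.append(word << 1)
    return words

def increment(ptr):
    """Advance a pointer to A<I,J>: to A<I,J-I> if J GE I, else to A[1]<I,36-I>."""
    index, size, pos = ptr
    if pos >= size:
        return (index, size, pos - size)
    return (index + 1, size, 36 - size)

def unpack(memory, start):
    """Model X<-BYTEP STG<7,36> followed by the LOOP statement."""
    ptr = (start, 7, 36)
    out = []
    while True:
        ptr = increment(ptr)                          # <+X> increments first
        index, size, pos = ptr
        ch = (memory[index] >> pos) & ((1 << size) - 1)
        out.append(ch)                                # stored even when zero
        if ch == 0:                                   # zero value ends the loop
            return out

print(unpack(pack("HELLO!"), 0))
```

Starting the pointer at position 36 means the first increment lands on the leftmost character, and the terminating zero is itself stored before the loop exits, as in the IMP fragment.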
2.2.7: Input/Output.
PRINT A,B,C
READ A,B,C
A, B, C, etc., are a list of expressions to
be transferred from/to input/output files,
with format specifiers mixed in. Any
expression may appear in a PRINT list; if an
expression other than a simple or subscripted
variable appears in a READ list the effect
will be to read into a temporary location.
The format specifiers are listed below.
The value of a PRINT statement is the
number of the last column on a line that was
printed into. In particular, the value of
PRINT / is 0. The value of a READ statement
is -1 if an attempt has been made to read
past the end of the file, 1 if an attempt has
been made to read past the end of a line, and
0 otherwise.
If no file is explicitly specified,
output is to file PO and input is from file
PI. There is no rule against using many
statements to print or read one line; the
only thing that terminates a line is a /. A
format, once specified, remains in effect
until another specification is encountered,
even through several different PRINT or READ
statements. Changing the PRINT format or
file does not affect the READ format or file,
and vice versa.
In printing, if a data item is presented
that is too large for the specified field
width, a field just large enough to contain
it is used. Consequently, if a field width
of 0 is specified, a field just large enough
to contain each data item will be used.
In reading, if the field width
specification is a single-character string
(e.g., ','), the input field will terminate
on the first of a) the specified character;
b) the end of the line; c) 128 input
characters. A field width specification of 0
is equivalent to ','. Blanks are ignored on
input (except in STG conversion and when ' '
is the field terminator). Tabs are converted
to a single blank on input. All fields are
limited to 128 characters on input.
Field widths in format specifiers are
usually constants but may be any expression.
It is necessary to close out your output
files explicitly in your program some time
before it terminates. This is done by the
subroutine call FINI(i), for i=0 or -1. This
closes all your files. If i=0, FINI exits to
the monitor; if i=-1, FINI returns normally.
You may also call FINI(NA,EXT,PN,PJ) where NA
and EXT are a specific file name and extension
in ASCII, and PN and PJ are a programmer and
project number. This causes FINI to close
out the specific file named and return
normally. If any of EXT, PN or PJ are zero,
the usual default case is assumed.
At most four files may be referred to in
IMP I/O. If it is desired to refer to more,
one of the previous files must be explicitly
closed using FINI (or close all of them with
FINI(-1)).
Format Specifiers:
IGR N
Conversion is in base 10, in a field N
positions wide. On output, leading zeros are
suppressed, and a leading minus sign is
printed for negative numbers.
OCT N
Conversion is in base 8, unsigned (i.e.,
negative numbers print with leading 7's), in
a field N positions wide. On output, leading
zeros are printed. If N is greater than 12,
exactly the leftmost N-12 positions will be
blank.
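The output rules for IGR and OCT can be sketched as two formatting functions. This is a Python model of the description, not the library's implementation; the names igr and oct_field are hypothetical, and filling unused positions with blanks on the left is an assumption.

```python
MASK36 = (1 << 36) - 1

def igr(value, width):
    """IGR N on output: decimal, leading zeros suppressed, minus sign if negative."""
    return str(value).rjust(width)        # a too-narrow field simply widens

def oct_field(value, width):
    """OCT N on output: unsigned octal of the 36-bit word, leading zeros printed."""
    digits = f"{value & MASK36:o}"
    # At most 12 digit positions; widen if the value needs more than the field.
    shown = max(min(width, 12), len(digits))
    return digits.zfill(shown).rjust(width)

a = 0o101
line = "A" + igr(a, 4) + oct_field(a, 6) + oct_field(a, 6)
print(line)                                   # A  65000101000101
print("".join(igr(i, 3) for i in range(3)))   # three fields, each 3 wide
print(oct_field(-1, 14))                      # two blanks, then twelve 7's
```

A width of 0 falls out of the same rule: no positions are requested, so the field grows to exactly fit the digits.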
STG N
Conversion is in ASCII, in a field N wide.
On output, if N is 0, characters are printed
until a 0 byte is encountered. Otherwise,
characters are printed until a 0 byte is
encountered or until N characters have been
printed.
FLT M.N
Prints a floating point number in a field M
columns wide, with N digits to the right of
the decimal point. If N is 0, no decimal
point is printed. If the number will not fit
in the field as specified, scientific
notation is used. If M is negative,
scientific notation is always used in a field
-M columns wide. (IMP uses " to mean 'times
10 to the'.) Exception: if M is 0, a field
just wide enough to contain the number is
used. On input, FLT M.N is equivalent to IGR
M.
/
New line. On output, causes termination of
the current line. If a line is printed but
not terminated with a /, then the line will
not print. Successive /'s produce blank
lines. On input, a / causes the remainder of
the current line to be skipped over.
FILE A
FILE A.B
FILE A[P,R]
FILE A.B[P,R]
Specifies the file for input/output. A and B
are file name and extension, and may be
either ASCII string constants or
unsubscripted variables. P and R are project
and programmer numbers. If they are
constants, they are interpreted correctly if
and only if they are written in octal but
without the letter "B" suffixed. If they are
expressions, it is the programmer's
responsibility to take into account the fact
that project and programmer numbers are
octal. Caution: FILE FOR03.DAT, e.g., refers
to variables FOR03 and DAT. He who wrote
this probably meant to write FILE
'FOR03'.'DAT'.
Examples: A←101B; PRINT STG 1,A,IGR 4,A,OCT
6,A,A,/ produces the line
"A  65000101000101"
(PRINT IGR 3,I)FOR I TO 2;PRINT / produces
the line
"  0  1  2"
PRINT STG 0, 'This is a Page Heading',/
produces the line
"This is a Page Heading".
DEVICE D
Causes the next FILE specification (on either
READ or PRINT) to refer to a file on device
D. D may be either an ASCII string constant
or a variable containing a string value.
Default device is 'DSK:'. I/O to the
teletype is a special case. The teletype is
designated as the I/O device by DEVICE
'TTY:', and no subsequent file specification.
The next file specification will cause the
input or output to revert to that file.
Caution: device specification will only work
for TTY: or for directory devices having
physical records of 128 words, such as disk
and DECtape.
IMAGE MODE
This format specification causes subsequent
data transfer to be in unformatted mode,
transferring 36-bit words between the file
and the variables in the i/o statement list.
Mixing IMAGE MODE and other format
specifications in reading or writing the same
file may produce anomalous results.
TAB N
On PRINT only, causes the next character of
output to appear in column N (numbering
starting with 1). If printing has already
gone past column N, no action is taken on that
line.
FILL 'c'
On output only, causes leading positions in
all fields to be printed as the specified
character instead of being left blank. Thus,
for example, FILL '0' causes leading zeros to
be printed. This specification, unlike the
others, affects all output files, not just
the current one.
2.2.8: Declarations; DATA and REMOTE statements.
LET I=2R,A=3,BF=B[1]
The LET statement provides a convenient way
of declaring synonyms. It consists of the
word LET, followed by a list of synonym
declarations of the form N=V, where N is a
name and V is a constant, or a simple or
subscripted variable. The effect is that all
subsequent appearances of N in the program
will be interpreted as if V had been written
instead(*).
A,B ARE 3 LONG,COMMON
1R IS REGISTER
FOO,3R IS RELEASED
Declarations: the general form for a
declaration is a list of names, then a list
of characteristics to be associated with
these names. The characteristics available
are:
------------
(*) LET A=B is in fact just "syntactic sugaring" for the
syntax statement <ATOM> ::= A ::="B".
REAL
Specifies that the variables are real
numbers, and all arithmetic performed on them
is to be floating point. Type conversion
between real and integer is always done
implicitly.
n LONG
Where n is a constant or constant expression.
The variables are declared to be arrays, and
n words are reserved for each of them. (The
n words may be referred to as A (or A[0]),
A[1], ... A[n-1].)
COMMON
Variables are made common. They are assumed
to be defined in another program unless they
are declared n LONG or appear as a tag in
this program, in which case they are defined
here. (n may be 1 if desired.) Declaring a
variable to be COMMON will make it the same
as a FORTRAN common block with that name.
Other variables may be located in that block
by defining the variables in syntax
statements or LET statements.
LOCAL
Makes variables local to the current
subroutine from this point to the end of the
subroutine, but does not affect the meaning
of the identifier outside the current
subroutine. This declaration is also useful
for making local register assignments.
REGISTER
If A is not among 0R-15R, the declaration A
IS REGISTER binds A to a register selected by
the compiler, in the range 1R-13R. Until the
declaration A IS RELEASED is encountered, all
references to A will be replaced by
references to the register. If A is a
register name 0R-15R, this declaration warns
the compiler to avoid using it, because the
programmer intends to use it. If the
register is already in use, an advisory is
issued.
RESERVED
This declaration precludes the use of the
associated register by the compiler up to the
end of the program or until a RELEASED or
AVAILABLE declaration is encountered.
RESERVED is distinguished from REGISTER in
that the latter only shields a register from
use by the compiler until the last time it is
referred to; RESERVED remains in effect for
the entire source program.
AVAILABLE
This declaration informs the compiler that
the associated register may be used by it for
computations. It does not affect the binding
between an identifier and a hardware
register, and the register will be reserved
again beginning at the next reference to it.
RELEASED
For identifiers which have been bound to a
hardware register, that binding is
terminated, and subsequent references to that
identifier will refer to a memory location.
The register becomes available to the
compiler. For hardware register names,
RELEASED is equivalent to AVAILABLE.
SCRATCH
Ordinarily, all registers which are reserved
at the point of a subprogram call are saved
before the call and restored afterwards.
This declaration signals the compiler not to
save a register.
PROTECTED
This declaration signals the compiler to
preserve through subprogram calls the value
of a register which had previously been
declared SCRATCH. PROTECTED is the default
mode.
REMOTE S
S is any statement. Has the effect of
causing the code for statement S to be
inserted not at the point in the program
where the REMOTE statement appears, but at
the end of the program, just before the
constants and variables. Useful in quoted
semantics where initialized local variables
are required, e.g., LOCAL FOO IN "... REMOTE
FOO:DATA(20); ...".
DATA (L)
L is a list of variable names and constant
expressions. The DATA statement produces
data words at that point in the program
containing the values of the constant
expressions, and the addresses of the
variables, in L. One word is used for each
item in L, except that ASCII strings are
stored in as many words as are required.
DATA statements may be used to preset
variables to a value at compile time, viz.:
VAR: DATA (3). Avoid putting DATA statements
where they might get executed (unless you
really want to execute your data).
2.2.9: Miscellaneous Constructs.
CALL ME Ishmael
Ordinarily, the name of the source file is
the name which activates the DDT symbol table
for a compiled IMP72 program. This statement
overrides that name, and assigns the name
Ishmael for that purpose.
EXECUTE E
Executes the instruction contained in the
variable E.
CALLI(C,V)
Executes the DECsystem-10 UUO CALLI C, with
the AC value V. C must be a constant less
than 4096, or else this construct will
compile as a call to the subroutine CALLI
(which fortunately is in the library package
FORTIO.REL - see Section 5). The value of
this construct is the AC value returned by
the CALLI. In addition, if the error return
was taken by the CALLI, the variable CALLI
will have been set to -1 (but note that if
the error return was not taken, its previous
value will not have been changed).
XWD A,B
Useful occasionally in system calls. Has the
value B<R> OR A<R> LS 18.
IOWD A,B
Has the value XWD -A,B-1.
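The two half-word constructs can be checked with a small model. The Python sketch below is an illustration only: the function names are invented, and it assumes that <R> selects the right (low-order) 18 bits of a 36-bit word.

```python
HALF = 0o777777  # mask for one 18-bit half word

def xwd(a, b):
    """XWD A,B: A's right half in the left half word, B's in the right."""
    return ((a & HALF) << 18) | (b & HALF)

def iowd(a, b):
    """IOWD A,B is defined as XWD -A,B-1."""
    return xwd(-a, b - 1)

# A 128-word transfer into a buffer at address 1000 (octal).
word = format(iowd(128, 0o1000), "012o")
```

Here word is "777600000777": the negated count sits in the left half and the address minus one in the right half.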
TWOSEG
This statement will cause reentrant (pure)
code to be produced by the program at the
head of which it appears. It is equivalent
to the compiler switch /R. It should be the
very first statement in the source file.
FIX(A)
Has the value of the expression A, converted
to type integer if it was not already of that
type.
FLT(A)
Has the value of the expression A, converted
to type real if it was not already of that
type.
MOVE A THROUGH N TO B
Generates a block transfer instruction, which
efficiently performs the operation
(B[I]←A[I]) FOR I TO N. N is any positive
expression (if real, it is fixed). A and B
are any expressions having an address, such
as simple or subscripted variables, or
[expression].
2.3: Syntactic and Semantic Extension.
This section explains the IMP72 facilities for
extending the language. It is possible to add productions
to the syntax for the language, defining the semantics for
the new constructs in several ways. Semantics may be
specified in terms of expressions written in the portion of
the language already defined, as explained in Section 2.3.1.
Alternatively, semantics may be performed by a series of
function calls to semantic subroutines contained within the
compiler, as documented in Section 2.3.3.1. The user may
use the subroutines provided, or, if he requires operations
not available from the current set of semantic subroutines,
he may as a last resort go into the compiler to add new
ones. Different semantics may be specified in special
cases, as explained in Section 2.3.3.2, either to perform
different operations upon objects in different contexts, or
to generate better object code. When the semantics for
several productions are very similar, they may be lumped
together under a general case, as noted in Sections 2.3.3.3
and 2.3.3.4.
2.3.1: The Easy Way - Syntactic "Macros"
This section is an introduction to syntactic extension
in IMP72 for the casual user, and is intended to provide him
with sufficient information to utilize the basic facility
without informing to the point of total confusion. Some of
these features are in fact less restricted than indicated in
this section. The full truth emerges in subsequent
sections.
IMP72 contains a facility for defining syntactic
macros, or patterns which the compiler will recognize in a
program and generate specific code for. An example of a
definition for the absolute value function using a syntactic
macro is
<EXP> ::= ABS ( <A> ) ::= "A<0=>-A ELSE A"
This statement, when inserted in a program, will cause the
compiler to recognize the construct ABS, followed by a "(",
followed by any expression, followed by a ")", and to
substitute for it the code enclosed in " signs, inserting
the actual expression for each instance of A in the quoted
expression. Thus, writing
X←ABS(R+3)
later in the program would be the same as writing
X←(R+3<0=>-(R+3) ELSE R+3)
A syntax statement fits the following pattern:
<EXP> ::= syntax part ::= semantic part
The syntax part is the pattern which the compiler is to
recognize. It consists of the names and special symbols in
the pattern, and, for each expression in the pattern, an
identifier in angle brackets: e.g., <FOO>. Single
characters in quotes (e.g., '#') are interpreted as that
character, without the quotes, and the characters : ; " <
and % must be quoted if they appear in the syntax part.
The semantic part consists of an IMP expression in
double quotes, perhaps preceded by a list of local
variables. It defines the code IMP is to generate when it
recognizes an instance of the syntax part, with the actual
expressions in the instance to be inserted in the quoted
expression in place of the identifiers which appeared in
angle brackets in the syntax part.
The quoted expression may be preceded with a list of
local variables. For example, notice that the definition of
ABS above computes A twice, which may be inefficient if A is
an expression. A more efficient way of doing it is
<EXP> ::= ABS ( <A> ) ::= LOCAL R IN "R IS REGISTER;
(R←A)<0=>-R ELSE R"
This definition computes A only once, storing it temporarily
in the register R.
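The difference between the two ABS definitions can be seen by counting how often the argument is computed. The following Python sketch is an analogy, not IMP72 code: the argument expression is modeled as a function that counts its own calls, and all names are invented for the illustration.

```python
calls = 0

def operand():
    """Stands in for the argument expression A; counts its evaluations."""
    global calls
    calls += 1
    return -7

def abs_twice(a):
    # Mirrors "A<0=>-A ELSE A": A is computed for the test and again for the result.
    return -a() if a() < 0 else a()

def abs_once(a):
    # Mirrors LOCAL R IN "(R←A)<0=>-R ELSE R": A is computed once and kept in R.
    r = a()
    return -r if r < 0 else r

calls = 0
first = abs_twice(operand)
calls_twice = calls

calls = 0
second = abs_once(operand)
calls_once = calls
```

Both versions return the same value; only the number of evaluations differs, which matters when A is costly or has side effects.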
Another example is taken from the syntax built into the
compiler:
<EXP> ::= <A> FOR <B> FROM <C> ::= LOCAL FOR IN "B←C;
FOR: A;
(B←B-1) GE 0=>GO TO FOR";
The list of local variables may consist of up to ten names,
separated by commas.
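The control flow of the FROM definition above can be traced step by step. The Python sketch below follows the quoted semantics literally; the function name is invented and the loop body is modeled as a callable.

```python
def for_from(body, c):
    """Trace of: B←C; FOR: A; (B←B-1) GE 0=>GO TO FOR."""
    b = c                 # B←C
    while True:
        body(b)           # FOR: A
        b -= 1            # B←B-1
        if not b >= 0:    # GE 0 => GO TO FOR
            return b      # falls through when B has gone negative

seen = []
final = for_from(seen.append, 3)
```

As written, the body runs with B counting down from C to 0, and it runs once even when C is negative, since the test follows the body.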
This concludes the introduction to syntactic macros.
2.3.2: Specifying Syntax.
The form of the syntax specification statement is a
modification of the form of a BNF production (Naur 1963),
with alternative right-hand sides not allowed, and a
semantics definition added at the right. The general form
is:
<class> ::= syntax-part ::= semantic-part
We will use the term "arguments" of a production or
syntax statement to refer to the non-terminals appearing in
the syntax-part of a syntax statement (i.e., on the
right-hand side of the production).
The compiler interprets a program by recognizing
instances of certain syntactic classes, such as <EXP>
(expression), <VBL> (variable), etc. The function of the
syntax part of a syntax statement is to tell the compiler
about a new construct that it must recognize as an instance
of a certain class, specified by the identifier in <class>.
The classes of interest in the IMP language are:
<NAME>: Any variable name or constant. Should
generally be avoided in favor of <VBL> unless
the user is sure of what he is doing.
<VBL>: Any constant or simple or subscripted variable.
<ION>: Any <VBL>, function call, or an <STL> enclosed
in parentheses.
<ATOM>: An <ION> or byte of an <ION>.
<BYTE>: Any byte or byte pointer reference.
<EXP>: Class of most expressions.
<ST>: Includes <EXP>'s, conditional expressions,
declarations, etc.
<STL>: List of one or more <ST>s, separated by ;'s.
The full set of syntactic classes in IMP may be
determined by reading the syntax of the language (see
Appendix II). The user may define his own syntactic classes
as the humor falls upon him.
The syntactic part of a syntax statement may contain:
1. Identifiers and special characters, representing
themselves. The characters %, ", :, < and ; must
be enclosed in single quotes.
2. Single characters enclosed in single quotes.
They are interpreted as the character alone.
3. Syntactic classes, represented by <CLASS,name>,
where CLASS is the name of the class and name is
an identifier by which to refer to this argument
of the production in the semantics.
4. <name>. This is interpreted as <EXP,name> unless
it is the very first thing after the ::=, in which
case it is interpreted as <ATOM,name>. This has
the effect of making operations defined using
<name> be right-associative, in keeping with the
IMP convention.
Example: In Section 2.3.1, an expurgated version of the
syntax of FROM loops was given. The adult version
illustrates the use of syntactic classes:
<ST> ::= <EXP,A> FOR <VBL,B> FROM <EXP,C> ::=
LOCAL FOR IN "B←C;
FOR: A;
(B←B-1) GE 0=>GO TO FOR"
The interpretation of the syntactic part of this
statement is that the compiler is henceforth to recognize as
an <ST> the construct consisting of any <EXP>, followed by
the word FOR, followed by any <VBL>, followed by the word
FROM, followed by any <EXP>.
Example: This example illustrates the use of
user-defined syntactic classes and recursive definitions.
The construct to be defined is GO TO (L0,L1,..,LN) I which
transfers control to Li where i is the value of I. (If I<0
or I>N, the effect is undefined.)
<ST> ::= GO TO (<GOLIST,A>) <B> ::=
LOCAL GO IN "GO TO GO[B];GO: A";
<GOLIST> ::= <NAM,A> ::= "GO TO A";
<GOLIST> ::= <GOLIST,A>,<NAM,B> ::= "A; GO TO B"
Then the expression
GO TO (A,B,C) J-3
produces
GO TO %GO1[J-3];
%GO1: GO TO A;
GO TO B;
GO TO C
(%GO1 is a unique name generated by the compiler for the
local variable GO.) This example depends on the fact that a
GO TO V generates exactly one machine word of code if V is a
variable.
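The way the recursive <GOLIST> productions build up the jump table can be imitated with string substitution. The Python sketch below is a simplification: the compiler works on parsed objects rather than text, and the generated name %GO1 is passed in as a parameter instead of being invented by the compiler. The function names are hypothetical.

```python
def golist(names):
    """<GOLIST> ::= <NAM,A> and <GOLIST> ::= <GOLIST,A>,<NAM,B>."""
    if len(names) == 1:
        return f"GO TO {names[0]}"                     # "GO TO A"
    return f"{golist(names[:-1])}; GO TO {names[-1]}"  # "A; GO TO B"

def go_to_list(names, index, local="%GO1"):
    """<ST> ::= GO TO (<GOLIST,A>) <B> with the local variable GO."""
    return f"GO TO {local}[{index}];{local}: {golist(names)}"

expansion = go_to_list(["A", "B", "C"], "J-3")
```

Each GO TO in the list occupies one word, so indexing from the tag by the value of the selector lands on the jump to the chosen label.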
It is possible to write a syntax statement with no
semantic part:
<class> ::= syntax-part
The semantics implied by this syntax statement is to discard
all arguments of the production except the first. If there
are no arguments or one argument, no information is lost.
2.3.3: Specifying Semantics.
2.3.3.1: Semantic Routines.
In Section 2.3.1 we saw that the semantic part of a
syntax statement could consist of quoted semantics - i.e.,
an IMP expression enclosed by double quotes. There is one
alternate format for semantics: a functional expression
consisting of calls to semantic subroutines.
Semantic subroutines are normal compiler subroutines
which are also available to the syntax writer in a limited
sense. They are called with arguments of the following
types:
1. Constants. Usually small positive numbers.
These can specify opcodes, switches, or whatever,
depending on the routine.
2. Negative directory indices. Identifiers are, by
convention, passed to semantic routines as the
negative of their index in the directory.
3. Registers. These are in the form of indices in a
table of symbolic registers. Assignments to
actual machine registers are made during the
assembly phase.
4. Objects.
5. Two arguments connected by a + sign. The value
is the sum of the arguments. A syntactic
ambiguity diagnostic will result from an argument
of the form arg+arg+arg; the diagnostic may be
ignored.
Objects are pointers into a stack which holds the
expressions being processed by the semantics of the current
production. An object corresponds to some <class,name> in
the production an instance of which is being compiled.
Objects either are identifiers from the program being
compiled, or have been constructed by semantics operating on
and/or combining other objects.
Objects come in five flavors: Name, Register, Constant,
Variable and Memory. All but Name will have an arithmetic
type (real or integer), and may have associated with them
some object code which computes the expression of which the
object is the value.
Name - Has no properties other than a name. Is turned
into another type of object by semantic routine
NAME (q.v.). Names will not pop up as operands;
they will have been passed through NAME on their
way to becoming <VBL>s.
Register - Has no properties other than a register. A
Register object usually designates a register
holding the result of a computation.
Constant - May be a constant from the source program
(in which case it has a name), or a computed one.
In any event, it has a value.
Variable - Designates a simple variable, with maybe a
constant subscript. No other properties.
Memory - Designates some address in memory too
complicated to be a Variable. May have any or all
of a name, a constant subscript, an index register
(containing the value of a computed subscript),
and even an indirect bit.
2.3.3.1.1: Calling Conventions for Semantic Routines.
The semantic part of a syntax statement may consist of
a call to a semantic routine, in standard IMP subroutine
call format. The arguments of the call may be either:
1. Constants, unsigned, less than 30 bits. The
argument passed to the semantic routine is that
number.
2. Identifiers appearing as names of instances of
classes in the production (e.g., A in the example
of Section 2.3.1). The argument passed to the
semantic routine is a "stack pointer" to the
object recognized as that part of the production.
3. Identifiers not appearing as names of instances
of classes. The argument passed to the semantic
routine is the negative of the directory index of
the name.
4. Another call to a semantic routine. The argument
passed is the value of the semantic routine.
5. Any two of the above connected by a + sign. The
argument passed is the indicated sum.
Although a semantic part consisting of calls on
semantic routines looks like part of an IMP program, there
are certain conventions which the semantics writer may take
advantage of here. Semantic routines may have up to 10
arguments. Arguments are evaluated strictly left to right
within one function call. If more or fewer arguments are
provided than the routine demands, no difficulty is
encountered (except that a reference to a missing argument
will refer to a random location in user core).
Therefore, although the ";" operator is not provided in
the syntax for semantic routine calls, its effects often can
be achieved through the use of arguments which will be
evaluated but not used. For example, the effect of
F(A,B); G(C,D,E)
may be obtained by
G(C,D,E,F(A,B))
if it does not matter that in the latter case C, D and E are
evaluated first.
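The ordering consequence of this trick can be demonstrated in any language that evaluates arguments left to right and tolerates surplus arguments. The Python sketch below is an analogy only: F, G and arg are stand-ins invented for the illustration, and the trace list records the order in which things happen.

```python
trace = []

def arg(name):
    """Stands in for evaluating one argument; records when it happens."""
    trace.append(name)
    return name

def F(a, b, *unused):
    trace.append("F")
    return 0

def G(c, d, e, *unused):
    # The fourth argument is evaluated by the caller but never used here.
    trace.append("G")
    return 0

G(arg("C"), arg("D"), arg("E"), F(arg("A"), arg("B")))
```

In the trace, F still completes before G is entered, but C, D and E are evaluated ahead of A and B, which is the difference from F(A,B); G(C,D,E) noted above.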
The following is an example of a syntax statement using
semantic subroutines to specify semantics. It is suggested
that the reader interpret the semantic routine calls using
the table below. 214B is the PDP-10 opcode for Move
Magnitude.
<ATOM>::= ABS ( <ATOM,A> ) ::= DEWOP(214B,AREG1(1,15B),A);
2.3.3.1.2: Table of Semantic Routines
In this table, the types of arguments expected are
indicated as follows:
S, T, U: Objects (other than Names except where
specified).
R, P : Registers.
I, J, N: Constants or (when specified) negative
directory indices.
* Indicates a routine which is object
machine-dependent (i.e., one which
must be altered if code for
another computer is to be
generated).
NAME AND ARGUMENTS EFFECT AND RESULT RETURNED.
ADDOP(I,S,T)
*Performs a binary operation on S and T.
It is defined as HOOK(S,T,
DEWOP(I,REGOF(FETCH(T)),S)). This could
be written in semantics but using ADDOP
saves compiler table space and is clearer
besides.
ADDR(S)
*Makes S a Memory-type object whose
address is what was previously its value.
The use of this is that whereas
DEWOP(I,R,S) compiles an instruction with
the address of S in the address field,
DEWOP(I,R,ADDR(S)) compiles an instruction
with the value of S in the address field.
Example: <EXP> ::= <ATOM,I> LS <EXP,J> ::=
ADDOP(242B,ADDR(J),I). ADDR is smart enough
to take any object (except Names) as S.
Result of ADDR is S.
AREG(I)
Returns the register index for hardware
register I.
AREG1(I,J)
Returns a brand-new register between
registers (I AND 37B) and (J AND 37B)
inclusive. If I has the 40B bit set, the
register is the returned value of a
subroutine and may be moved to make room
for another such. If J has the 40B bit
set, reserve two consecutive registers and
return the first (see also REG2S). If I
has the 100B bit set, this register is
talked about explicitly by the user and he
doesn't want any old compiler going around
altering its value implicitly.
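The packing of range and flag bits into the two AREG1 arguments can be made explicit with a small decoder. The Python sketch below only restates the bit assignments given above; the function and field names are invented for the illustration and do not appear in the compiler.

```python
def decode_areg1(i, j):
    """Split the AREG1(I,J) arguments into the fields described above."""
    return {
        "low": i & 0o37,                     # lowest acceptable register
        "high": j & 0o37,                    # highest acceptable register
        "subroutine_value": bool(i & 0o40),  # may be moved for another such value
        "pair": bool(j & 0o40),              # reserve two consecutive registers
        "user_register": bool(i & 0o100),    # named explicitly by the user
    }

plain = decode_areg1(1, 0o15)       # the call AREG1(1,15B) used in this manual
flagged = decode_areg1(0o141, 0o55)
```

For AREG1(1,15B) the decoder reports the range 1 through 13 (decimal) with no flags set.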
BYTEP(S,T,U)
*S is a Variable-type object. Makes a
byte pointer for S<T,U>, and puts it off
at the end of the program with REMOTE
(q.v.). Result is S, whose value is now
the (variable containing the) byte
pointer. Good things to DEWOP on S are
byte instructions.
CONOP(S,T,I,U)
Performs an operation designated by I on
two Constant-type objects S and T. See
file SYNTAX or the source for CONOP for
the codes for I. Feel free to add a few
more operations if it will generate better
code. U is the implicand in cases of CON
RELOP CON=>U. Result is S, with code for
T hooked in first if there is any.
COPY(S)
Produces a copy of the object S, including
a copy of the code associated with it, if
any. Useful if S is to be hooked in in
two different places in code to be
generated. Care must be taken to use the
copy of S in the place which is hooked in
first, since trying to copy an object that
has already been hooked in may imperil the
internal tranquility of the compiler.
DATAST(S)
For use in DATA statements. S is any old
object; it gets clobbered. Unpacks the
current list (see ENLIST), and generates
code for the DATA statement whose
constants and variables are in the list.
DECL
DECLARE
Special semantic routines to implement
declarations.
DEWFUN(S,I,J)
Adds the code to the object S which
instructs the assembly pass to perform
special function I, with argument J.
DEWOP(I,R,S)
*Performs machine opcode I upon object S
and (if nonzero) register R. This is a
very smart routine and is happy with any
object (except a Name) as S. If possible,
it will use an immediate instruction for
constant operands. Caution: If DEWOP is
used to implement a binary operator (as in
DEWOP(OP,REGOF(T),S)), the code for T must
be hooked on to the code for S after the
DEWOP is performed, otherwise it will be
lost. See ADDOP. Result of DEWOP is S.
If R is nonzero, S is now a Register
object, with value in the register
designated by R. Normally, if the result
of the instruction generated is to
destroy, as a side-effect, the value in R,
and R is a user-specified register (as in
the expression 5R LS 8), a move is
inserted to get the value of R into a
scratch register first. This move may be
suppressed by adding 1000B to the opcode
I.
ENLIST(S)
Places the object S at the bottom of the
current list. ENLIST is part of a general
recursive list mechanism for stacking up
lists of things to be fed all at once to
some semantic routine. NEWLIST(S) creates
an empty list, and, if S is nonzero, puts
the object S on it. GETLIST(S) is a
non-semantic subroutine which unpacks the
current list on a first-in-first-out
basis. If the list is not empty,
GETLIST(S) puts the top object into the
object S (clobbering the old value), and
returns a non-zero value. If the list is
empty, GETLIST returns 0, and reopens the
list which was current at the last call to
NEWLIST. See DECLARE (file IMPSEM) for an
example of a semantic routine which uses
GETLIST. The result of ENLIST is 0.
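The nesting behavior of the list mechanism is easier to follow in a model. The Python sketch below is a simplified reading of the description above, not the compiler's implementation: the class name is invented, and getlist returns the unpacked object itself (or 0) instead of storing it into an argument and returning a nonzero flag.

```python
class ListStack:
    """Toy model of NEWLIST / ENLIST / GETLIST with nested lists."""

    def __init__(self):
        self.lists = [[]]            # the innermost (current) list is last

    def newlist(self, s=0):
        self.lists.append([])        # open an empty list, making it current
        if s != 0:
            self.lists[-1].append(s)

    def enlist(self, s):
        self.lists[-1].append(s)     # place S at the bottom of the current list
        return 0                     # the result of ENLIST is 0

    def getlist(self):
        current = self.lists[-1]
        if current:
            return current.pop(0)    # first in, first out
        if len(self.lists) > 1:
            self.lists.pop()         # reopen the list current before NEWLIST
        return 0

stack = ListStack()
stack.newlist("a")
stack.enlist("b")
stack.newlist()                      # a nested list
stack.enlist("x")
unpacked = [stack.getlist() for _ in range(5)]
```

The inner list is drained first; the 0 that marks its end also makes the outer list current again, so the outer items follow.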
ENSTACK(I)
Creates an object of type Name, where I is
the directory index (either positive or
negative) of the name. For example, to
make the object corresponding to the
variable VAR, write NAME(ENSTACK(VAR)).
ERROR(N,V)
Produces the compiler error message which
is the name of the variable V. N is 0, 1
or 2 for a fatal, ordinary, or advisory
error respectively. For error messages
containing spaces, etc., enclose the
message in ! signs to make it into a
variable name. Example: ERROR(2,!You
Goofed.!).
FETCH(S)
*Forces the object S to be of type
Register (i.e., loads it into a register).
Result is S.
FETCH2(S)
*Forces the object S to be of type
Register, when double registers are being
used. Does a fetch into the first
register of a newly reserved register pair
unless S is a programmer-defined register,
in which case nothing is done. Result is
S.
FIX(S)
*Generates code to convert S from floating
point to integer. Value is S.
FLOAT(S)
*Generates code to convert S from integer
to floating point. Value is S.
FREEZE(S)
Flags the code associated with S so that
the assembly phase will not optimize out
any MOVE instructions in it. Useful when
generating skip instructions. Result is
S.
HOOK(U,S,T)
Hooks the code for T on after the code for
S. U gets this code, and the value of T.
U may be one of S or T (and usually is).
HOOK thus performs the ";" operator, among
its other uses. It must be used whenever
two or more arguments which might have
code attached appear in a production,
except in the few cases where another
semantic routine invokes HOOK implicitly.
The result of HOOK is U.
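What HOOK does to the code attached to objects can be pictured with a toy model. In the Python sketch below an object is just a value plus a list of instructions; the class, its fields and the instruction text are all invented for the illustration and say nothing about the compiler's real data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Obj:
    value: str                       # what the object stands for
    code: list = field(default_factory=list)

def hook(u, s, t):
    """HOOK(U,S,T): code for T follows code for S; U gets both, and T's value."""
    combined = s.code + t.code       # computed first, so U may be S or T
    u.code = combined
    u.value = t.value
    return u                         # the result of HOOK is U

s = Obj("X", ["MOVE 1,X"])
t = Obj("Y", ["MOVE 2,Y"])
result = hook(s, s, t)               # U is S, as is usual
```

After the call the result carries the instructions for S followed by those for T, and its value is that of T, which is the effect of the ";" operator.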
MAXWELL
MAXEND
Specialized subroutines to handle the job
of switching parser output into a
temporary array rather than feeding it to
the code generator. Used to implement
quoted semantics.
NAME(S)
S is a Name-type object. NAME turns it
into a variable, adding an arithmetic
type, and making things right if S is a
register or subroutine parameter. Result
is S.
NEWLIST(S)
See ENLIST.
OJUMPOP(I)
*I is an M-field for a 300-series opcode
(conditional) on the PDP-10. Thus I
specifies a relational operator. OJUMPOP
returns the M-field for the negation of
that operator. If I has the 10B bit set,
OJUMPOP returns the M-field for the
reverse negation (the reverse of >, for
example, is <.)
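The negation and reversal of a condition field can be written out explicitly. The Python sketch below uses the standard PDP-10 condition codes (1=L, 2=E, 3=LE, 5=GE, 6=N, 7=G); that OJUMPOP returns just the three-bit field, and the function body itself, are assumptions made for the illustration rather than a transcription of the compiler routine.

```python
# Condition codes of the PDP-10 compare, jump and skip instructions.
MODES = {0: "never", 1: "L", 2: "E", 3: "LE", 4: "always", 5: "GE", 6: "N", 7: "G"}

# Swapping the operands turns < into > and <= into >=; E and N are unchanged.
REVERSE = {1: 7, 7: 1, 3: 5, 5: 3}

def ojumpop(i):
    """Return the field for the negation (or, with the 10B bit, reverse negation)."""
    m = i & 0o7
    if i & 0o10:
        m = REVERSE.get(m, m)        # reverse first; the two steps commute
    return m ^ 0o4                   # flipping the 4 bit negates the condition

negated = MODES[ojumpop(7)]          # negation of G
reverse_negated = MODES[ojumpop(0o17)]
```

In this model the negation of G is LE, and the reverse negation of G is GE (the reverse of > is <, whose negation is >=).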
PAR(I)
Used in CASE semantics to refer to the Ith
argument (counting from 0) of the CASE
argument list (see Section 2.3.3.3).
PRINCAL
PRINPAR
*Specialized routines handling semantics
for PRINT and READ statements.
REGOF(S)
Result is the register of the object S, or
0 if it doesn't have one.
REG2S(S)
If S is of type register, and the register
was assigned by AREG1 so as to reserve two
consecutive registers, then the value of S
becomes the contents of the second
register. S is otherwise unchanged. If S
is not as specified, undefined things
happen. The value of REG2S is S.
REMOTE(S)
Causes the code associated with S to be
inserted at the end of the compilation
instead of at the current point. Result
is S, but with no code. (This may lead to
incorrect results if an attempt is made to
compute with the value of S.)
REPVAL(S,T)
Part of the semantics for syntax.
Generalizes the VALUE class S within the
quoted semantics T (see Section 2.3.3.4).
RETURN(S)
*Generates code to return from the current
subroutine (JRA) with S as the value of
the subroutine. Result is S, not that it
matters.
STACKUP(I)
I is either a negative directory index or
a constant. This routine is used to
create objects and place them on the
stack. (If I is a constant, the constant
is first entered in the directory, so that
a directory index for the item exists.)
NAME(ENSTACK(I)) is then performed to make
the object. Result is stack index of the
item.
SETPRI(I)
Part of the semantics for syntax. Sets
the designated priority bits in the
semantics being generated.
STACK(I)
Returns the Ith argument of the
production. Used to refer to arguments
which are VALUE classes in default
semantics (i.e., semantics which are not
conditional.)
STORE(S,T)
*Stores the value of T in S, hooking the
code for S on after that for T. Result is
T.
SUBBEG(S)
*Generates code for the beginning of
subroutine S, where S is a Name-type
object. Result is S, with the code
attached.
SUBPR0(I)
Called with I=0 at the beginning, and I=1
at the end, of every subroutine call.
Keeps track of the level of calls, and
resets the index of the next temporary for
subroutine arguments to 0 whenever it gets
up to level 0.
SUBRCALL(S)
*S must be of type Variable. Generates a
subroutine call to S (JSA 16,S), plus code
to declare S common, and to save any
registers in use at the time in
temporaries. (Registers are restored by a
DEWFUN(T,2,REGOF(T),SUBPR0(1)), where T is
the complete subroutine call with
arguments.) The result of SUBRCALL is S,
whose value is now register 0R, which is
where subroutines return values.
SUBRPAR(S,T)
*S is a subroutine call or some similar
thing being built, and T is an argument.
A word of code containing a JUMP A, where
A is the address of T, is added on to the
end of S. If T has no address, A is the
address of a temporary containing the
value of T. The code for T, and any
additional code necessary to get the JUMP
A up to specs, is hooked on to the
beginning of S. The result is S.
SUBSCRIPT(S,T)
*Generates code for the Variable- or
Memory-type object S[T]. Is smart about
not computing constant subscripts, etc. S
and T may be any type except name. Result
is S, with the code for T hooked on
before.
SVAL(I)
Creates an object whose VAL is I. (See
also VAL).
SWITCH(I)
Turns on the compiler switch which is the
Ith letter of the alphabet.
TAG(S)
S had better be an object of type Name, or
Variable without a subscript. Adds to S
the code which defines the tag S at that
point. Result is S.
VAL(S)
Returns the low order 18 bits of the value
of the object S. If S is a constant, then
this is the low 18 bits of its value. If
S is the result of a VALUE semantics, then
this is the value specified there. If S
has had a value set by SVAL, then this is
that value.
VALU(S,T)
Part of the semantics for syntax. Defines
the current semantics to be VALUE S OF T,
where S is a Constant object and T is a
Name object.
In addition to the above, there are a number of
semantic routines which define the semantics for syntax, all
on file RSYN. See Section 4.4 for documentation.
2.3.3.2: Conditional Semantics.
We saw above that semantics may consist of either a
semantic routine call or quoted semantics. It is possible
to invoke one of a number of semantics for a given syntactic
production, depending on the particular case of that
production being compiled. First an example:
<ATOM>::= NOT <ATOM,A> ::= "NOT B0CON"=>CONOP(B,0,6) ELSE
DEWOP(460B,AREG1(1,15B),A);
The semantic part of this syntax statement invokes different
semantics for the special case of NOT-constant.
Semantics may consist of a number of alternatives,
separated by the identifier ELSE. One alternative may
consist simply of a semantic routine call or quoted
semantics: this is the semantics executed if none of the
special cases obtain. The other alternatives are of the
form
condition => semantics
where the semantics is a semantic routine call or quoted
semantics, and the condition is an IMP expression which is a
case of the production, enclosed in double quotes. The
arguments of the semantics should be identifiers which
designate arguments in the condition. Referring to
identifiers which designate arguments in the syntax part
will not give a diagnostic but might produce undesired
results.
The condition is satisfied if the expression being
parsed matches the expression in the condition. Constants
in the condition will match any constant with the same
value, even if the names are different. (But constant
expressions in the condition are not evaluated. Thus, "-18"
in a condition would only match in cases where a minus sign
was actually used before a constant with value 18. One
should trust the compiler to perform constant arithmetic,
and write 777777777756B instead, which will match any
constant expression which evaluates to -18.) Variable names
will match any <EXP>, unless they are tagged with modifiers.
If the same name appears twice or more in the condition, the
expressions matching all occurrences must be identical.
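The octal constant quoted above can be checked by hand or by
machine. The following is a small Python sketch (not IMP72; the
helper names word36 and imp_octal are ours) which assumes only
the 36-bit word of the PDP-10 and two's-complement arithmetic:

```python
WORD_BITS = 36                      # PDP-10 word length
WORD_MASK = (1 << WORD_BITS) - 1    # 777777777777 octal

def word36(n: int) -> int:
    """Return n as an unsigned 36-bit two's-complement word."""
    return n & WORD_MASK

def imp_octal(n: int) -> str:
    """Format a word the way the manual writes octal constants (trailing B)."""
    return format(word36(n), "o") + "B"

print(imp_octal(-18))   # 777777777756B
```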
Modifiers may be added to the first occurrence of a
given variable name in a condition by adding the character
'0' to the name, followed by the modifiers. Modifiers all
consist of three letters, sometimes followed by one or two
two-digit octal numbers. For example, A0CON, FOO0REG0115,
I0IGRVARMEMREG. The modifiers are:
Arithmetic type modifiers (Identifiers with neither of these
match either type):
IGR: Identifier matches only objects of type integer.
FLT: Identifier matches only objects of type real.
Object type modifiers (Identifiers with none of these match
any object type; identifiers with several match
any of the indicated types):
REG: Matches objects of type Register.
REGii: Matches objects guaranteed to be in register ii.
REGiijj: Matches objects guaranteed to be in a register
between ii and jj. Range for ii and jj is 0-37
(octal).
CON: Matches objects of type constant.
CONii: Matches objects which are constant, and zero
except for the ii rightmost bits.
CONiijj: Matches objects which are constant, and zero
except for the ii bits starting jj from the right
end of the word. Range for ii and jj is 0-77
(octal).
CNG: Same as CON (and may have ii or iijj attached),
but matches objects which are constant in the
designated field, with all bits outside that field
set to 1. CON and CNG should not both be used to
modify the same identifier.
VAR: Matches objects of type Variable.
MEM: Matches objects of type Memory.
Code modifier:
WRD: Matches objects with exactly one word of code
associated with them. May not always succeed when
one might think it should, due to the order in
which objects are combined by the code generator,
but will always fail when it should.
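The naming convention for modified identifiers can be made
concrete with a short Python sketch (an illustration only, not
the compiler's own code). It assumes that the first '0' in the
identifier ends the name, and that each modifier is three
letters followed by zero, one or two two-digit octal numbers.
The function name parse_condition_identifier is hypothetical.

```python
import re

KNOWN = {"IGR", "FLT", "REG", "CON", "CNG", "VAR", "MEM", "WRD"}
# One modifier: three letters, then zero, one or two two-digit octal numbers.
MODIFIER = re.compile(r"([A-Z]{3})((?:[0-7]{2}){0,2})")

def parse_condition_identifier(ident: str):
    """Split e.g. 'FOO0REG0115' into ('FOO', [('REG', [1, 13])])."""
    # Assumption: the first '0' ends the name and starts the modifiers.
    name, sep, tail = ident.partition("0")
    if not sep:
        return ident, []                     # no modifiers at all
    modifiers, pos = [], 0
    while pos < len(tail):
        m = MODIFIER.match(tail, pos)
        if m is None or m.group(1) not in KNOWN:
            raise ValueError(f"bad modifier in {ident!r}")
        digits = m.group(2)
        # Each pair of digits is one octal number (ii, then jj).
        numbers = [int(digits[i:i + 2], 8) for i in range(0, len(digits), 2)]
        modifiers.append((m.group(1), numbers))
        pos = m.end()
    return name, modifiers

for example in ("A0CON", "FOO0REG0115", "I0IGRVARMEMREG"):
    print(example, parse_condition_identifier(example))
```

An unrecognized modifier raises an error in this sketch, in the
spirit of the BAD MODIFIER diagnostic of Section 3.2.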
Numerous examples of conditional semantics may be found
on file SYNTAX, which contains the syntax for IMP72.
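The CONii, CONiijj and CNG modifiers described above may be
read as bit-mask tests on a 36-bit word. The Python sketch
below shows one such reading; it is our interpretation of the
wording, not the compiler's code, and the function names are
hypothetical. Here ii is the field width and jj the distance of
the field from the right end of the word, both already
converted from octal.

```python
WORD_MASK = (1 << 36) - 1           # 36-bit PDP-10 word

def field_mask(ii: int, jj: int = 0) -> int:
    """Mask of ii bits whose lowest bit is jj positions from the right."""
    return (((1 << ii) - 1) << jj) & WORD_MASK

def matches_con(value: int, ii: int, jj: int = 0) -> bool:
    """CONii / CONiijj: every bit outside the field must be 0."""
    return ((value & WORD_MASK) & ~field_mask(ii, jj)) == 0

def matches_cng(value: int, ii: int, jj: int = 0) -> bool:
    """CNGii / CNGiijj: every bit outside the field must be 1."""
    return ((value & WORD_MASK) | field_mask(ii, jj)) == WORD_MASK

# CON22 (22 octal = 18 decimal): zero except for the right half word.
print(matches_con(0o777777, 0o22))
# CNG22: the 36-bit form of -18 has all ones in the left half word.
print(matches_cng(-18, 0o22))
```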
Care should be taken to specify special cases as cases
of the proper production. For example, suppose one wished
to refer to the right 18 bits of a variable by the construct
/variable/. This might be accomplished by
<VBL> ::= / <VBL,A> / ::= DEWOP(550B,AREG1(1,13),A)
which fetches the right half of A into a register by a Half
Right to Right, Zeros instruction. If one wanted also to be
able to write
/variable/ ← expression
and store the value of the expression in the right half of
variable, one might then write
<VBL> ::= / <VBL,A> / ::= DEWOP(550B,AREG1(1,13),A)
ELSE "/A/←B" => HOOK(A,B,DEWOP(542B,REGOF(FETCH(B)),A))
(542B is a Half Right to Right Memory instruction.) But
/A/←B is a special case not of this production but of the
production for "←". The above syntax statement would produce
an error diagnostic. The correct definition would be
<VBL> ::= / <VBL,A> / ::= DEWOP(550B,AREG1(1,13),A);
<EXP> ::= <VBL,A> ← <EXP,B> ::=
"/A/←B" => HOOK(A,B,DEWOP(542B,REGOF(FETCH(B)),A))
This would produce an advisory diagnostic, since the syntax
in the second statement duplicates a production already in
the language, but this is permitted in order to allow
defining additional special cases of semantics, exactly as
is done here.
2.3.3.3: CASEs.
If two or more productions have exactly the same
syntax, except for terminal symbols (i.e., they contain the
same syntactic classes in the same order, and are instances
of the same class), and if their semantics are defined by
identical semantic routine calls, with perhaps only a few
constants changed (e.g., different opcodes), then it is
possible to define a general case, and define the semantics
of each production as a special case of it.
Example:
<EXP> ::= <ATOM,I> LS <EXP,J> ::= CASE (242B,514B,554B) OF
SHIFTS (ADDOP(PAR(0),ADDR(J),I)
ELSE "A0VARMEM LS 18"=>DEWOP(PAR(1),AREG1(1,15B),A)
ELSE "A0VARMEM LS 777777777756B"=>
DEWOP(PAR(2),AREG1(1,15B),A))
ELSE "A0CON LS B0CON"=>CONOP(A,B,5);
<EXP> ::= <ATOM,A> ALS <EXP,B> ::=
CASE (240B,514B,574B) OF SHIFTS;
<EXP> ::= <ATOM,A> LROT <EXP,B> ::=
CASE (241B,204B,204B) OF SHIFTS;
The first statement defines the syntax for I LS J. The
semantic part defines the case SHIFTS to consist of the set
of semantic alternatives within the parentheses following
the identifier SHIFTS. (Notice that the A0CON LS B0CON
alternative is not part of the SHIFTS definition but is an
alternative to it.) The list of constants following the
identifier CASE is the "parameter list" for this particular
instance (LS) of the case SHIFTS. These constants are
referred to in the semantics for SHIFTS by PAR(0), PAR(1),
and PAR(2).
The other two statements define the syntax and
semantics for two other kinds of shifts. The semantics is
defined by invoking SHIFTS with different parameter lists.
The semantics are the same, but PAR(i) will refer to the
i-th element of the appropriate parameter list.
A case, such as SHIFTS, may not be defined in more than
one place. It is not necessary to have parameter lists be
the same length for different instances of a case, but it is
the user's responsibility to insure that parameters that are
not supplied are not referred to. In any event, parameter
lists must always contain at least one constant.
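The relation between a case and its parameter lists can be
pictured as a table lookup. The Python sketch below is an
illustration only: the table holds the three parameter lists
from the example above, and par plays the role of PAR(i). The
names SHIFT_PARAMS, par and shifts_opcode are ours, and the
range check is something the sketch adds (the compiler leaves
it to the user).

```python
# Parameter lists copied from the three syntax statements above (octal opcodes).
SHIFT_PARAMS = {
    "LS":   (0o242, 0o514, 0o554),
    "ALS":  (0o240, 0o514, 0o574),
    "LROT": (0o241, 0o204, 0o204),
}

def par(instance: str, i: int) -> int:
    """PAR(i): the i-th constant (counting from 0) of this instance's list."""
    params = SHIFT_PARAMS[instance]
    if not 0 <= i < len(params):
        # The manual leaves this check to the user; the sketch makes it loud.
        raise IndexError(f"PAR({i}) not supplied for {instance}")
    return params[i]

def shifts_opcode(instance: str, alternative: int) -> str:
    """Opcode the shared SHIFTS semantics would use for one alternative."""
    return format(par(instance, alternative), "o") + "B"

for op in SHIFT_PARAMS:
    print(op, [shifts_opcode(op, i) for i in range(3)])
```

The same SHIFTS semantics thus yields 242B for LS, 240B for ALS
and 241B for LROT in the general alternative, which refers to
PAR(0).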
Several CASE semantics may appear as some or all of the
alternatives in a semantic part. A CASE semantics may not
be the semantics part of a conditional semantics alternative
(i.e., "condition"=>CASE (...) OF FOO is illegal.)
2.3.3.4: The VALUE Kludge.
Occasionally, it is necessary to refer in conditional
semantics to an argument of the production which can not be
written as an identifier. This is not possible using the
mechanisms defined so far. The VALUE kludge is a way to
accomplish this. Take for example the syntax for
conditionals. Relational operators are defined as follows:
<RELOP> ::= NE ::= VALUE 6 OF EQ;
<RELOP> ::= '<' ::= VALUE 1 OF EQ;
and so on. The semantics specifies that these productions
are special cases of a class of productions labeled EQ, and
gives an 18-bit value to be associated with the particular
production.
In syntax statements involving the class <RELOP>, it is
possible to specify that any case of EQ may be recognized as
matching a particular element in a semantic condition. This
is done as follows:
1. In quoted unconditional semantics: prefix the
quoted part by EQ/. The corresponding instance of
RELOP in the syntax part must be named EQ. (Last
line in the example below).
2. In semantic conditions: Prefix the condition by
EQ/.
3. In semantics: The element may be referred to in
quoted semantics by prefixing the quoted part by
EQ/ as in (1). In semantic routine calls, the
element may be referenced by STACK(i), where the
element comes to the right of exactly i
identifiers in the condition (or, in unconditional
semantics, i arguments of the production).
The particular instance of EQ may be determined by
referring to VAL(STACK(i)), with i as in (3) above. The
value of this semantic routine call will be the value
specified in the VALUE semantics defining the instance of
<RELOP> in the expression being compiled.
Example:
<ST> ::= <EXP,A> <RELOP,EQ> <EXP,B> => <ST,C> ::=
EQ/"P0REG=0=>GO TO S0VAR"=>
HOOK(P,P,DEWOP(320B+VAL(STACK(1)),REGOF(P),S)) ELSE
LOCAL IF IN EQ/"NOT(A=B)=>GO TO IF; C; IF: 0";
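The arithmetic 320B+VAL(STACK(1)) in the example can be worked
through for the two <RELOP> definitions quoted earlier in this
section. The Python sketch below is an illustration only; the
table and function names are ours, and the mnemonics in the
comments are the standard PDP-10 names for these opcodes rather
than anything stated in this manual.

```python
# VALUE n OF EQ: values taken from the two <RELOP> definitions quoted above.
RELOP_VALUE = {"NE": 6, "<": 1}

def jump_opcode(relop: str) -> int:
    """320B + VAL(STACK(1)) for the given relational operator."""
    return 0o320 + RELOP_VALUE[relop]

def imp_octal(n: int) -> str:
    """Write an opcode in the manual's octal notation (trailing B)."""
    return format(n, "o") + "B"

print(imp_octal(jump_opcode("NE")))   # 326B (JUMPN: jump if not zero)
print(imp_octal(jump_opcode("<")))    # 321B (JUMPL: jump if less than zero)
```

A single conditional semantics thus produces the proper jump
for every relational operator defined as a case of EQ.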
2.3.3.5: Priority Semantics.
Usually, the semantics for an expression is not
executed until enough of the context of the expression
has been parsed in order to determine the particular
special case involved. If it is known that no special
cases involving subexpressions of a production exist,
or if it is desired for other reasons to defeat the
special case matching mechanism and invoke the
semantics for a production immediately it is
recognized, the PRIORITY semantics is used.
A PRIORITY semantics is a semantic part consisting
entirely of
PRIORITY n s
where n is a digit in the range 0-7, and s is a
semantic part containing no conditional semantics or
ELSE alternatives.
n is interpreted as a three-bit mask, and the
semantics s for the production is executed immediately,
bypassing the special case matching process, in the
event that the current priority mask of the matching
routine and n have any bits set in the same position.
The priority mask of the matching routine is set
to k by the semantic routine SETPRI(k). The
interpretations of the bits are:
1. Normally set. Used to force immediate semantic
execution under normal circumstances.
2. Set during interpretation of double-quoted
expressions. In this mode, semantics are not
interpreted unless they are PRIORITY 2. The
output of the parser is stored instead, for use as
data by the compiler. However, PRIORITY 2
semantics are executed, providing a way to
terminate this state.
4. Set during synonym processing (LET statement).
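The test applied to a PRIORITY n semantics is a bitwise AND of
n with the current mask. A minimal Python sketch of that test
follows; the constant and function names are ours, chosen to
mirror the three bits listed above.

```python
NORMAL, QUOTED, SYNONYM = 1, 2, 4   # the three mask bits listed above

def executes_immediately(n: int, current_mask: int) -> bool:
    """True if PRIORITY n semantics fire under the matcher's current mask."""
    if not 0 <= n <= 7:
        raise ValueError("n must be a digit in the range 0-7")
    # Any bit set in the same position in both values is enough.
    return (n & current_mask) != 0

print(executes_immediately(2, QUOTED))           # PRIORITY 2 inside double quotes
print(executes_immediately(1, QUOTED))           # PRIORITY 1 inside double quotes
print(executes_immediately(3, NORMAL))           # bits 1 and 2, normal mode
```

Under this reading, PRIORITY 3 fires both in normal mode and
inside double-quoted expressions, while PRIORITY 0 never fires.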
2.3.4: Syntactic Ambiguity and How to Make it Work for You.
The IMP72 parser allows ambiguity in the syntax. For
example, notice that in many examples above, we present
different definitions of the construct ABS(E). The syntax
allows two interpretations of this construct: the special
construct we are defining, or a call on the subroutine ABS.
The parser notes the two interpretations, and chooses
between them, as follows:
1. If an identifier is an expression in one
interpretation, and a terminal symbol (i.e.,
appears as itself in the production, such as ABS
in the example of Section 2.3.1), in the other
interpretation, the interpretation with the
terminal symbol is chosen.
2. Otherwise, the interpretation involving the
smallest number of different syntax rules is
chosen.
3. Otherwise, the compiler makes an arbitrary
choice.
Thus, referring to the example in Section 2.3.3.2 of
the construct /variable/, the special case
/variable/←expression was defined by a special case in the
semantics. It would alternatively be possible to define the
case by introducing a deliberate syntactic ambiguity, as
follows:
<VBL> ::= / <VBL,A> /
::= DEWOP(550B,AREG1(1,13),A);
<EXP> ::= / <VBL,A> / ← <EXP,B>
::=
HOOK(A,B,DEWOP(542B,REGOF(FETCH(B)),A))
The expression /X/←5, e.g., could be recognized in two
ways, by using the first syntax statement above plus the
rule for <VBL>←<EXP>, or by using the second syntax
statement above. But according to rule 2, the parser
chooses the second interpretation, as involving one syntax
rule as opposed to two rules for the first interpretation.
This is precisely the choice that is desired.
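The three rules for choosing between interpretations can be
sketched in Python as follows. This is a much simplified model,
not the parser's algorithm: an interpretation is reduced to the
set of syntax rules it uses and the set of identifiers it
treats as terminal symbols, and all names in the sketch are
hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interpretation:
    label: str
    rules: frozenset        # names of the distinct syntax rules used
    terminals: frozenset    # identifiers matched as terminal symbols

def choose(a: Interpretation, b: Interpretation) -> Interpretation:
    # Rule 1: an identifier that is a terminal symbol in one reading but
    # not in the other decides in favour of the terminal-symbol reading.
    if a.terminals - b.terminals and not (b.terminals - a.terminals):
        return a
    if b.terminals - a.terminals and not (a.terminals - b.terminals):
        return b
    # Rule 2: the reading with fewer distinct syntax rules wins.
    if len(a.rules) != len(b.rules):
        return a if len(a.rules) < len(b.rules) else b
    # Rule 3: arbitrary choice (here, simply the first one).
    return a

# The /X/<-5 example: two rules versus one combined rule.
two_rules = Interpretation("/VBL/ then VBL<-EXP",
                           frozenset({"/VBL/", "VBL<-EXP"}), frozenset())
one_rule = Interpretation("/VBL/<-EXP",
                          frozenset({"/VBL/<-EXP"}), frozenset())
print(choose(two_rules, one_rule).label)
```

For /X/←5 neither reading involves a terminal identifier, so
rule 1 does not apply and rule 2 selects the single combined
rule.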
2.3.5: Peaceful Co-Existence with Your Extensible Compiler.
This section presents the results of some experience
with syntax and semantic definition in IMP72, in order to
help others avoid some problems.
The parser works by carrying along all possible parses
of the program up to the symbol it is reading at the moment.
It usually manages to resolve all ambiguity every couple of
symbols at most. If a syntactic production is entered which
requires it to read a large number of symbols before it can
choose which syntax rule it is following, the parser may
require inordinate amounts of space and time. This
condition may be diagnosed by looking at the compiler
statistics at the bottom of the source program listing. If
the Max. Parse Space goes over 1500+500, or the Max.
Output Space is over 150, the condition may exist.
An example of syntax which may cause the problem is:
<EXP> ::= <EXP,A> => <ST,B>;
<EXP> ::= <EXP,A> => <EXP,B> ELSE <ST,C>
When the compiler sees A=>(.......), it is not able to
decide until the right parenthesis whether it is trying to
form an ST or an EXP. Thus it must carry along two parses
while it is parsing the entire contents of the parentheses,
which may be an arbitrarily long expression. The syntax IMP
uses is
<EXP> ::= <EXP,A> => <ST,B>;
<EXP> ::= <EXP,A> => <ST,B> ELSE <ST,C>
which avoids the parser problem (although it introduces the
necessity for parenthesizing the ELSE clause). The method
for storing syntax rules combines all rules whose right-hand
parts begin identically, up to the point at which they
differ. Thus, the parser stores these two rules as:
                    =:: <EXP>
                   /
<EXP,A> => <ST,B>
                   \
                    ELSE <ST,C> =:: <EXP>
and only carries along one parse until it has finished
parsing the <ST,B>.
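The effect of combining rules with a common beginning can be
shown with a small prefix tree. The Python sketch below is an
illustration only and makes no claim about the compiler's
actual tables; add_rule and shared_prefix are hypothetical
names, and argument names such as A and B are dropped from the
class symbols.

```python
def add_rule(trie: dict, rhs: list, lhs: str) -> None:
    """Insert one production, sharing nodes with any common prefix."""
    node = trie
    for symbol in rhs:
        node = node.setdefault(symbol, {})
    node["=::"] = lhs            # marks a point where a rule is complete

def shared_prefix(trie: dict) -> list:
    """Symbols read before the stored rules first diverge."""
    prefix, node = [], trie
    while len(node) == 1 and "=::" not in node:
        symbol, node = next(iter(node.items()))
        prefix.append(symbol)
    return prefix

# The syntax IMP uses: the two rules differ only after <ST>.
good = {}
add_rule(good, ["<EXP>", "=>", "<ST>"], "<EXP>")
add_rule(good, ["<EXP>", "=>", "<ST>", "ELSE", "<ST>"], "<EXP>")

# The troublesome syntax: the rules diverge right after =>.
bad = {}
add_rule(bad, ["<EXP>", "=>", "<ST>"], "<EXP>")
add_rule(bad, ["<EXP>", "=>", "<EXP>", "ELSE", "<ST>"], "<EXP>")

print(shared_prefix(good))
print(shared_prefix(bad))
```

In the first tree the two rules share everything through the
<ST>; in the second they part company immediately after the =>,
which is where the parser must begin carrying two parses.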
Another difficulty which may arise is that special
semantic cases may be specified in a syntax statement, but
the compiler will refuse to recognize them. This may arise
from the way in which the patterns for special cases are
recognized. Semantics are performed for subexpressions of a
syntax production from left to right. For an element in a
syntactic conditional to be recognized, it must already have
been reduced by semantics to one object. An example will
illustrate the pitfall:
<EXP> ::= <ATOM,A> NE <EXP,B> ::= .....;
<ST> ::= <EXP,A> => <ST,B> ::=
"A NE 0=>B" => ..... ELSE .....
where the ..... represents semantics whose exact form is
nonessential to the point being illustrated. If the
statement
A NE 0=>X←X+4
is encountered by the compiler, an attempt to match the
special case "A NE 0=>B" is made, but fails since X←X+4 is
not a single object; the semantics for ← and + have not yet
been performed. Since the special case fails, the semantics
for the subexpressions of the statement are invoked, from
left to right. First, the semantics for A NE B is
performed. Checks are made at every step to see if the
current expression matches some special case, but now A NE 0
is a single object, and will only match patterns of the form
"X=>...". Thus the compiler will fail to recognize the
special case "A NE 0=>B".
It is in order to avoid this problem that IMP has
separate productions for A=>B and A <RELOP> B=>C (where
<RELOP> is the class of relational operators).
3: How to Compile and Run IMP72 Programs.
3.1: Compiling Programs.
The IMP compiler is called by the DECsystem-10 command
R IMP (on TENEX, the command IMP). The compiler will
respond by typing its version number and an asterisk. It
will now accept a command line in the following format:
obj,list←dev:file.ext[pj,pg]/a/b(cd)
All fields are optional. When all are specified, the
compiler will compile a program from source file
file.ext[pj,pg] on device dev. A listing will be written on
file list.LST, and a relocatable object program will be
written on file obj.REL. Listings requested by compiler
switches will be written on file file.LST if no list file is
specified. If the compiler detects an error, a listing will
be produced from that point in the source file on. If the
extension "ext" is omitted, the compiler will look for a
file with null extension, and then for file.IMP.
a, b, c, d are compiler switches, as follows:
/A Produce an Assembly listing.
/C Continue after this file (see below).
/H Help - list the switches available.
/L Produce a source listing.
/R Compile Re-entrant (pure) code.
/U Exit to save compiler.
/V Exit to save low segment of compiler (see below).
/Y List source program on TTY as it is compiled.
If you want to compile the IMP program on file FILE,
the simplest way is to give the compiler the command string
FILE. If a listing is required, the string FILE/A/L will
do.
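The command line format given above can be taken apart
mechanically. The Python sketch below is an illustration of the
format only, not the compiler's command scanner. It assumes
that pj and pg are octal numbers and that switch letters may be
given either as /x or grouped in parentheses; the names COMMAND
and parse_command are ours.

```python
import re

COMMAND = re.compile(r"""
    ^(?:(?P<obj>[^,←]*)(?:,(?P<list>[^←]*))?←)?    # obj,list←   (optional)
    (?:(?P<dev>[^:/\[(]+):)?                       # dev:        (optional)
    (?P<file>[^.\[/(]*)                            # file
    (?:\.(?P<ext>[^\[/(]*))?                       # .ext        (optional)
    (?:\[(?P<pj>[0-7]+),(?P<pg>[0-7]+)\])?         # [pj,pg]     (optional)
    (?P<switches>(?:/[A-Za-z]|\([A-Za-z]+\))*)$    # /a/b(cd)
""", re.VERBOSE)

def parse_command(line: str) -> dict:
    """Return the fields of a command line; omitted fields are None."""
    m = COMMAND.match(line.strip())
    if m is None:
        raise ValueError(f"cannot parse command line: {line!r}")
    fields = m.groupdict()
    # The switch text holds only '/', '(', ')' and letters at this point.
    letters = re.findall(r"[A-Za-z]", fields.pop("switches"))
    fields["switches"] = sorted({c.upper() for c in letters})
    return fields

print(parse_command("FILE/A/L"))
print(parse_command("OBJ,LST←DSK:PROG.IMP[11,22]/A/L(RY)"))
```

Fields that are omitted come back as None, leaving the caller
to apply the defaults described above (for example, looking for
a null extension and then for file.IMP).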
The /C switch enables a file containing only syntax and
declaration statements to be compiled for the purpose of
making a version of the compiler for an augmented dialect of
IMP. After the file has been compiled, the compiler returns
with another prompt, and the /V or /U switches may be used.
The /U switch causes the compiler to exit, after doing
some housekeeping. At this point, the monitor command SSAVE
NEWIMP will make files NEWIMP.LOW and NEWIMP.SHR (or a file
NEWIMP.SAV on TENEX) which will be a new version of the
compiler, containing any syntax and declarations just
compiled.
The /V switch is similar to the /U switch. It is used
to make an augmented version of the compiler while retaining
the previous shared high segment. The saved version of the
compiler will not contain a sharable high segment, but will,
whenever it is run, fetch the high segment from the previous
version of the compiler. On TENEX, or if the high segment
has been altered, this can not be done. In this case, the
error message HIGH SEGMENT NOT SHARABLE is given, and the
same action is taken as in the case of the /U switch. It is
suggested that the /V switch be used wherever possible,
since the size of the high segment is considerable, and it
is desirable that two different versions of the compiler be
able to share the same high segment, and that identical
copies of the high segment not be stored on disk.
If the compiler should encounter a fatal error, such as
an illegal memory reference, and terminate abnormally, it is
possible to salvage the listing file up to that point.
Execute the command REENTER, and look for a file 005IMP.TMP,
where 005 is your job number.
If the monitor at your installation has been modified
appropriately, it will recognize files with the extension
IMP as IMP72 source files, and the monitor commands COMPILE,
LOAD and EXECUTE may be used to manipulate IMP programs.
The switch "/IMP" may be used to inform these commands that
a file with a non-standard extension does in fact contain an
IMP program.
3.1.1: The Compiler Listing.
The source listing of the compiler has a number at the
left of each line of code. This indicates the level of
parenthesis nesting at the beginning of the line.
Parentheses contained within # or ' signs are not counted.
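The number in the left margin can be reproduced by a simple
count. The Python sketch below is an illustration, not the
compiler's listing routine. It assumes that a # or ' pair may
extend over several lines and that there is no way to escape
either character; the function name nesting_levels is ours.

```python
def nesting_levels(lines):
    """Yield (level at the beginning of the line, line) for each line."""
    level = 0
    quote = None                 # '#' or "'" while inside such a pair
    for line in lines:
        yield level, line
        for ch in line:
            if quote is not None:
                if ch == quote:  # the matching sign closes the pair
                    quote = None
            elif ch in "#'":
                quote = ch       # parentheses are ignored from here on
            elif ch == "(":
                level += 1
            elif ch == ")":
                level -= 1

source = [
    "A <- (B +",
    "  F(C) # not ( counted #",
    ") ; D",
]
for level, text in nesting_levels(source):
    print(level, text)
```

When a # or ' is left unmatched, the count in this sketch stops
following the parentheses from that point on, which is the
symptom described under PREMATURE END OF INPUT FILE in Section
3.2.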
3.2: Compilation Error Diagnostics.
The compiler reports on three classes of errors: fatal
errors, errors, and advisory errors. Advisory errors
indicate a condition which may not be an error but should be
brought to the programmer's attention. They are labeled
ADVISORY. Fatal errors terminate compilation immediately.
Regular errors do not terminate compilation, but subsequent
compilation may be affected.
The philosophy behind the error diagnostics in IMP
presumes a certain amount of maturity on the part of the
programmer. He is less restricted than in a language such
as FORTRAN, but he is also less protected from his own
errors. Thus, a number of potential error conditions that
may be used usefully are not checked for by the compiler.
This is especially true of syntax statements. This is not
to imply that few useful diagnostics are provided, however.
The following are the error diagnostics provided by the
compiler. They are regular errors unless otherwise
indicated.
ATTEMPT TO EXECUTE UNDEFINED SEMANTIC ROUTINE (Fatal): It
wasn't bad enough you referred to a nonexistent
semantic routine in a syntax statement (for which you
already got an error message). You had to go and try
to compile an instance of the production. This you
won't get away with.
BAD MODIFIER: A semantic condition has an identifier with a
bad modifier.
BAD RESULT OF SEMANTICS; IGNORED (Advisory): A semantic
routine has returned a value for a semantic part
which is not an object on the semantics routine
stack. This may or may not cause trouble later on.
If you have not added semantic routines to the
compiler, report the error to the person responsible
for maintaining the compiler.
CANNOT CREATE .REL FILE (Fatal): For some reason, which
probably has to do with the operating system, a .REL
file for your object program could not be created.
Perhaps another user is writing on the file, or there
is no space available on the storage device.
CALCULATED CONSTANT IS SUBSCRIPTED (Advisory): An expression
has been evaluated to a constant and now an attempt
is being made to subscript it. Since the calculation
yields only one word, this doesn't make much sense.
CANNOT GET TWO CONSECUTIVE REGISTERS (Fatal): The assembler
reports that an operation (such as divide) which
requires two consecutive machine registers was not
able to find two consecutive registers not in use.
CASE CONTAINS SUBEXPRESSION WHICH IS CASE BUT WONT
GENERALIZE (Advisory): You are warned that a semantic
condition in a CASE definition contains a
subexpression which is also an instance of the
production in the syntax part. The main expression
will generalize to other instances of the CASE, but
the subexpression won't.
CASE UNDEFINED (Fatal): You tried to reference a CASE
semantics, but the case was never defined.
CODE GENERATION STACK DID NOT REDUCE TO ONE ITEM: The
program parsed correctly, but semantics for some of
the program could not be found. If you added your
own syntax in the program, check it. Otherwise,
report the problem to the person responsible for
maintaining the compiler.
DEBUGGING PROGRAM NOT PRESENT (Advisory): A debugging switch
has called for a printout not available from the user
version of the compiler. Recompile the program using
the version containing the debugging routines.
DUPLICATE TAG: The same label was used as a tag in two (or
more) places.
ERROR CORRECTOR GIVES UP. (Fatal): The error corrector was
unable to correct a syntax error.
ERROR CORRECTOR GIVES UP AT % (Fatal): The error corrector
was unable to correct a syntax error, and read past
the % sign while trying.
ERROR IN ADDCHAR - NAME > 50 CHARACTERS: A syntactic class
name is too long.
ERROR IN SYNTAX READ-IN: There is a syntax error in the
syntax for syntax during the first stage of compiler
bootstrapping. Will not occur except when compiler
is first being generated.
FREE STORAGE EXHAUSTED: Your program has demanded more space
to compile than is available as user core. If you
had syntax errors, you may be able to compile when
the errors are repaired. Otherwise, make your
program smaller or find a bigger machine (or a stiff
drink).
FREE STORAGE UNLINKED: Fatal error caused by a compiler bug.
Give the listing produced to the person responsible
for maintenance of the compiler.
IN AREG1 - I OR J OUT OF RANGE: Argument to AREG1 negative
or greater than 177B.
N NOT A CONSTANT IN N LONG: N may be a constant expression
but must contain no variables (except those
syntactically defined as constants).
NAME TOO LONG (Fatal): You used an identifier longer than
100 characters in a syntactic condition. Why?
NESTED SUBROUTINES: Due probably to mismatched parentheses,
a subroutine definition has been initiated before the
previous one was closed out.
NON-OCTAL PJ OR PN: In FILE designator of PRINT or READ
statement, that is.
NUMBER MISSING OR OUT OF PLACE IN SEMANTICS IDENTIFIER: In a
semantic conditional, you had a number in the wrong
format or wrong place.
OUT OF REGISTERS (Fatal): The assembler reports that your
program has managed to fill up all available hardware
registers and then some. If you are using explicit
registers, perhaps you have generated an expression
that you think fits in one register but isn't coming
out that way. If not, you may just have a very
complicated expression somewhere. Split it up.
PREMATURE END OF INPUT FILE: Compiler read the entire input
file but did not encounter the %% which terminates
IMP programs. Perhaps you: 1. Only had one %. 2.
Failed to match # or ' with another one, so that the
% sign got absorbed in a constant/comment. Check the
numbers down the left margin of the listing to see
when they stopped following the parenthesis nesting
of the code. The compiler offers a chance at
redemption in the form of a prompt for a new file
name. Hitting the RETURN key will cause the compiler
to insert its own %% signs to complete the
compilation.
PRODUCTION DUPLICATES ONE ALREADY IN SYNTAX (Advisory): The
production in a syntax statement is identical to one
already in the syntax. This is not an error, and is
used to add special semantic cases to already defined
syntax.
PUSHDOWN STACK UNDERFLOW ERROR (Fatal): Indicates a compiler
bug. Inform the person responsible for maintaining
the compiler.
QUESTIONABLY DEFINED VARIABLE (Advisory): The indicated
variable was referenced but never (1) had a value
assigned to it, or (2) appeared as an argument of a
subroutine call, or (3) was declared COMMON.
QUOTED SEMANTICS NOT AN INSTANCE OF PRODUCTION: A semantic
condition was not an instance of the production in
the syntax statement in which it appeared.
REGISTER CONFLICT(S) (Advisory): A machine register was
referred to which was already in use by the compiler.
The operation was performed as requested, but
prudence dictates a close inspection of the assembly
listing of the code generated. This message appears
only once regardless of the number of register
conflicts, and is followed on the list file by error
messages for each conflict, indicating the register
involved and the address of the first reference
producing a conflict, relative to the beginning of
the object code.
SEMANTICS STACK OVERFLOW (Fatal): Congratulations! There is
exactly one major table in the compiler which is not
dynamically expandable, and YOUR program has stuffed
it full. This can only happen through the use of a
very complex semantic routine call in a semantics
part of a syntax statement, or maybe a bug in a
semantic routine, or a very long list of names in
your program. Cognoscenti will appreciate the
information that this may result from lots of calls
to ENSTACK in one semantics.
STACK UNDERFLOW IN FSUB (Fatal): Indicates a compiler bug.
Inform the person responsible for maintaining the
compiler.
SUBSCRIPTED REGISTER - IGNORED: The subscript on a register
is disregarded.
SYNTACTIC AMBIGUITY; UNRESOLVED AT %: The parser has found a
syntactic ambiguity which it was not able to resolve
by the end of your program. If your program contains
syntax statements, check them; otherwise report the
error to the person responsible for maintaining the
compiler.
SYNTACTIC AMBIGUITY: The compiler was able to interpret a
portion of your program in two conflicting ways, and
was not able to decide which way you had in mind.
The ambiguous segment of the program is indicated on
the listing (after a fashion) by a list of the
identifiers (variable names and constants) which
appear in the segment, printed in reverse order.
Some constructs which may produce this error are:
A=>B=>C ELSE D, which is really ambiguous and should
be parenthesized explicitly, and (A RELOP B)=>C,
where RELOP is any relational operator, which is not
ambiguous, and will generate correct code, but the
parentheses may be removed to get rid of the error
message.
SYNTAX ERROR: The parser has not been able to interpret your
program as a legal IMP expression. It has attempted
to continue the compilation by replacing part of your
program at the point of the error by what it hopes is
a correction of the error. It is possible that
more errors will result from this correction later
on. (On the other hand, in contrast to the previous
PDP-10 IMP compiler, which quit at the first error,
there will at least BE a later on.)
The offending line will be typed on your teletype,
with a line feed inserted at the point at which the
error was detected.
Syntax errors are handled by the error-correcting
routine, which may take some time on certain errors.
When the error has been corrected, two asterisks are
printed after the message ** SYNTAX ERROR. If this
seems to be taking an unduly long time, you may want
to control-C, execute the command REE, and then look
for a file named 005IMP.TMP (where 005 is your job
number), which contains your listing up to the point
of the error.
TOO MANY ARGS TO SEMANTIC RTN OR PARAM STACK UNDERFLOW
(Fatal): Either you tried to execute a semantic
routine with more than 10 arguments, or else there is
a compiler bug.
TOO MANY LOCAL SYMBOLS IN REFERENCED SEMANTICS (Fatal): You
tried to reference quoted semantics with more than 10
local symbols.
TOO MANY REFERENCES TO ONE REGISTER (Fatal): Either you have
devised a way to refer to a register or register
variable four thousand ninety-six times in your
program (unlikely), or else a linked list of object
code has gotten tied into a loop (regrettable, but
conceivable). If the latter is the case, report to
the person responsible for maintaining the compiler.
This error has also been known to crop up when
programs which have syntactic errors are being
compiled. Although the error corrector always makes
corrections which are SYNTACTICALLY CORRECT, there is
no guarantee that they will always be SEMANTICALLY
MEANINGFUL. If you get this error message and your
program contains syntactic errors, correct the errors
and re-compile. This error message should go away.
TOO MANY SYNTAX ERRORS (Fatal): A large number of syntax
errors have been detected in your program, and the
compiler sees no point in proceeding.
TWO LENGTHS IN SAME DECLARATION: Two lengths in same
declaration.
VALUE NOT A CONSTANT: Value in VALUE semantics must be an
18-bit constant.
VECTOR APPEARS IN MORE THAN ONE DECLARATION: The same
variable name was declared N LONG more than once.
3.3: Loading and Running.
On DECsystem-10, you may load and run your object
programs by using the appropriate monitor command (.EX,
.LOAD or .DEB), with a list of the files you wish to load.
File IMPLIB.LIB should be loaded in library search mode.
Example:
.EX URPROG.REL,URSUBR.REL,/LIBRARY,IMPLIB.LIB
On TENEX, IMP object programs must be loaded by the
LOADER subsystem, and run with the START command. See the
TENEX User's Guide and the DECsystem-10 Assembly Language
Handbook for more information on the loader.
If your program does any I/O, it should call FINI (see
Section 2.2.7) to complete the I/O operations.
If your program should bomb out, the monitor command
.REE will usually close out your files without losing any
output. DDT users may activate the symbol table using the
name of the .REL file. This is the name specified in the
CALL ME statement, if there was one in the source file, or
else the name of the source file.
3.4: Making a New Compiler.
If it is desired to create a new IMP72 compiler from
scratch, the following procedure should be followed:
1. Make a version of the compiler containing the syntax
on file IMPSYN.IMP. This syntax is necessary to
compile each compiler subroutine. The way to make
the version of the compiler is: run the compiler, and
respond to the * prompt with IMPSYN.IMP/C. When
another * prompt is given, type /V. The compiler
will exit to monitor level. Type SSAVE FOO. Now you
have a version of the compiler named FOO. Use it to
compile the source files for the compiler. Be sure
to specify the compiler switch /R when compiling each
source file, in order to generate a sharable high
segment for the compiler.
2. Load and execute all .REL files for the compiler,
together with the IMP I/O library (IMPLIB.LIB). If
you desire a compiler with the compiler debugging
facilities, load DEBUG.REL; otherwise ignore the
resulting undefined globals and charge on. In any
event, ignore the undefined global PEEK, which will
not be called (except on the Yale computer). The
file R.CMD may be used to load the compiler by the
DECsystem-10 command LOAD @R, or the command string
R.CMD@<ALTMODE> to the TENEX LOADER subsystem.
3. The compiler will start by reading part of file
SYNTAX, and then come back with a SYNTAX BOOTSTRAPPED
message and an asterisk. Tell it to compile the rest
of the syntax on SYNTAX by typing /C<CR>.
4. When (after several minutes) the compiler returns
another asterisk, you may type /V<CR>, and then save
the core image, which is now an IMP72 compiler.
5. If you wish to add new syntax, say on file NEWSYN,
instead of /V type NEWSYN/C<CR>, and proceed to step
4.
4: Internal Documentation of the IMP72 Compiler.
This section provides an overview of the organization
of the IMP72 compiler. Documentation of specific routines,
data bases, etc., is included at the beginning of each file
of source code. The intent of the present section is to
provide a frame of reference for the reader who wishes to
tinker with the compiler. For those with a more
intellectual interest in the compiler's workings and design,
[Bilofsky 1973], [Irons 1971] and [Weingart 1973] may be
more edifying.
The compiler is programmed modularly wherever possible,
at some cost in efficiency. (This cost has been
substantially reduced by replacing many subroutine calls by
in-line syntactic macros.) This philosophy was adopted in
order to enable replacement and modification of portions of
the compiler with minimal repercussions. Accordingly, there
is no common data base in the compiler (minor exceptions
exist for the sake of efficiency in the assembly phase).
Necessary information concerning another module's data base
is acquired via subroutine calls.
The compiler may be divided conceptually into the
following sections:
SOURCE     DATA BASE           FUNCTIONS
FILES
-------------------------------------------------------------
I. Housekeeping.
FREE       Free storage.       Free storage module. Maintains free
                               storage, which is dynamically allocated
                               and used for virtually all tables.
DIR        Directory.          Directory module.
STACK      Semantic routines'  Semantic stack module. Maintains a
           stack               stack for parameters of semantic rtns.
"          (provided by        Bit matrix module. Creates and
           caller)             accesses arbitrary bit matrices.
"          Pushdown stack      Pushdown stack module. Provides a PD
                               stack for anyone who wants.
IMP        (none)              Driver program.
DEBUG                          Debugging module, which may be loaded
                               at compiler generation time.
PMOD                           Formatted print module.
-------------------------------------------------------------
II. Housekeeping & Parsing.
LEX                            Input and lexical analyzer.
RSYN                           Syntax bootstrap routine, and semantics
& RSYN2                        for syntax statements.
-------------------------------------------------------------
III. Parsing.
PARSE      Parse arrays.       Parse module. Calls heavily upon GRAPH
                               and ERCOR.
ERCOR      Parse arrays.       Error correction and syntactic
                               disambiguation.
GRAPH      Syntax graph &      Syntax graph module. Constructs and
           connectivity        accesses the syntax data base.
           matrices.
-------------------------------------------------------------
IV. Semantics.
ENTREE     Code generation     Code tree module. Builds and accesses
           tree.               code generation tree.
COTREE     Code generation     Code generation module. Calls heavily
           stack.              on code tree module. Does code tree
                               matching & invokes semantic routines.
                               Maintains code generation stack.
DOSEM      Semantics           Semantics module. Builds and accesses
                               semantics array. Oversees semantics
                               stack.
-------------------------------------------------------------
V. Semantics & Code Generation.
CODE       Entries in          Semantic routines, mostly
& CODE2    semantics stack;    target-machine-dependent. Generate
& CODE3    code arrays.        intermediate format object code.
IMPSEM     Code and register   Machine-independent semantic routines.
           arrays.             Declarations and register assignment.
-------------------------------------------------------------
VI. Code Generation and Housekeeping.
AMOD       Code array.         Assembly module. Performs some final
                               code optimization.
RMOD       Register array.     Register allocation module.
OMOD       Object code.        Relocatable file generation module.
4.1: Parsing.
The general flow of control within the compiler is as
follows: PARSE obtains a stream of input symbols (in the
form of directory indices) from LEX, and produces a stream
of parser output in the form of terminal symbols (negative
directory indices) and production numbers (positive integers
n, conventionally indicated by [n]; n is in fact an index
in the semantics array, for historical reasons, but need not
be). The parser output is in Polish postfix form.
The parser is a bottom-up parser (Irons 1971), driven
through one cycle by each input symbol. At the end of the
cycle, it has a set of possible parses for the input string
so far. It takes one of the following actions, depending on
the number of parses in the set.
1. If there are no parses, a syntax error has occurred.
ERCOR is called to generate a string of symbols which
may be inserted into the parse to form a
syntactically correct expression. (ERCOR uses the
philosophy, but not the exact method, of Irons
1963.)
2. If there is more than one parse, AMBIG is called to
decide if an ambiguity has generated multiple parses,
and to discard the redundant parses if this is the
case.
3. If there is one unique parse, the portion of the
parser output stream generated since the last time
there was one unique parse is passed to the semantics
portion of the compiler by calling COTREE.
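The three cases above can be sketched in Python as a dispatch on the size of the parse set. This is an illustration only: the callables correct_error, disambiguate and emit are hypothetical stand-ins for ERCOR, AMBIG and COTREE, and representing a parse as a Python list is an assumption made for the example, not the compiler's actual data structure.

```python
def end_of_cycle(parses, pending, correct_error, disambiguate, emit):
    """Decide what to do after one input symbol has driven the parser.

    parses  -- candidate parses still alive after this symbol
    pending -- parser output produced since the last unique parse
    The three callables are hypothetical stand-ins for ERCOR, AMBIG, COTREE.
    Returns the (possibly changed) parse set and the still-pending output.
    """
    if len(parses) == 0:
        # Case 1: syntax error; ask the corrector for a repaired parse set.
        return correct_error(), pending
    if len(parses) > 1:
        # Case 2: several parses; discard redundant ones and keep waiting.
        return disambiguate(parses), pending
    # Case 3: one unique parse; pass the accumulated output to semantics.
    for item in pending:
        emit(item)
    return parses, []
```

Note that output is held back in cases 1 and 2, which is why the semantics portion only ever sees output belonging to a unique parse.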
All other functions performed by the compiler,
including addition of syntax and semantics to the compiler
tables, and invoking the assembly phase, are performed by
semantic routines invoked by the productions which PARSE
passes to COTREE for interpretation. Thus the entire flow
of control of the compiler is directed by the syntax and
semantics of the language.
4.2: Semantics.
COTREE maintains a code generation stack, which starts
off as the parser output in postfix form. The stack is
collapsed as the productions in it are applied to their
arguments: the appropriate semantics for a production are
invoked to produce a single stack entry (semantic object),
which is inserted back in the stack in place of the
production and its arguments. This could always be done
at the bottom of the stack (we will consider the working end
to be the bottom), as soon as the parser supplied a
production, were it not for the special semantic cases.
These require that a production not be applied to its
arguments until enough additional parser output has been
supplied to ensure that the production is not part of some
special case.
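Leaving the special cases aside, the collapsing of the stack is ordinary postfix evaluation. The following Python sketch shows only that simple case. The arity table and the apply_semantics callback are assumptions introduced for the example; the compiler itself finds a production's arguments by pattern matching, as described below.

```python
def collapse(parser_output, arity, apply_semantics):
    """Reduce postfix parser output to semantic objects.

    parser_output   -- ints: negative = terminal symbol,
                       positive = production number
    arity           -- hypothetical table: production number -> argument count
    apply_semantics -- hypothetical callback(production, args) -> object
    """
    stack = []  # the working end ("bottom" in the text) is the end of the list
    for item in parser_output:
        if item < 0:
            stack.append(item)            # terminals simply wait on the stack
            continue
        n = arity[item]
        cut = len(stack) - n
        args = stack[cut:]                # the production's arguments
        del stack[cut:]
        # One semantic object replaces the production and its arguments.
        stack.append(apply_semantics(item, args))
    return stack
```

For example, with productions 1 and 2 each taking two arguments, the output -1, -2, [1], -3, [2] collapses to a single object built from production 2 applied to the result of production 1 and the terminal -3.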
In order to do this, all special cases are kept in a
code generation tree, which is built and accessed by
routines in the code tree module, ENTREE. The special cases
are entered in the tree as nodes which may match either a
particular class of objects (as specified by the modifiers
in the semantic condition; see Section 2.3.3.2), or a
particular production. When a new successor is added to a
node, it is carefully inserted among existing successors so
that the order of successors of a node, from left to right,
is in order of increasing generality. This is done by
ENTREE, and by subroutines INSNO and COMPNO. This order
ensures that the most specific case in the tree is the one
to match a given piece of parser output (subject to the
difficulties mentioned in Section 2.3.5), since nodes are
searched left-to-right while attempting a match.
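The effect of keeping successors in order of increasing generality can be sketched in Python as follows. The numeric generality rank and the predicate functions are assumptions made for the example; how INSNO and COMPNO actually compare two nodes is documented on the source files, not here.

```python
import bisect

class Node:
    """A tree node whose successors are kept most-specific-first."""

    def __init__(self):
        self.successors = []   # list of (generality, predicate, label)

    def add(self, generality, predicate, label):
        # Hypothetical ranking: a smaller number means a more specific case.
        ranks = [g for g, _, _ in self.successors]
        where = bisect.bisect_right(ranks, generality)
        self.successors.insert(where, (generality, predicate, label))

    def match(self, item):
        # Left-to-right search: the first successor that accepts the item
        # is the most specific one that can.
        for _, predicate, label in self.successors:
            if predicate(item):
                return label
        return None
```

Because the ordering is maintained at insertion time, the order in which cases are defined does not matter: a general case entered first is still searched last.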
The definition of unconditional semantics for a
production causes the most general possible case of that
production to be entered in the code generation tree,
ensuring that if no special case is found the semantics for
the general case will be invoked.
The terminal nodes of the code generation tree must
always match some production. In order to avoid excessive
fanout to these nodes, the terminal nodes coming from any
given node are condensed into one. The entry for each
production in the semantics array contains a pointer to a
list for that production, in the array PROSEM (part of the
semantics module), containing the names of all the terminal
nodes in the code generation tree where that production may
match, and a pointer back to the semantics array entry for
the semantics to be invoked at that node. The PROSEM entry
for a production also contains a bit vector telling at which
levels of the code generation tree the production may match
some node, either terminal or otherwise. This information
makes it possible for COTREE to ascertain in many cases that
no match is possible starting at a given point in the code
generation stack, without having to search the entire tree.
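One way such a bit vector can rule out a match early is sketched below in Python. The sketch assumes that the stack item k positions past the starting point would have to match at level k of the tree, and that a production with no recorded levels can match nowhere; both are assumptions made for the illustration, and the function names are hypothetical.

```python
def level_mask(levels):
    """Build a bit vector with one bit set per tree level."""
    mask = 0
    for level in levels:
        mask |= 1 << level
    return mask

def may_match(mask, level):
    """True if the bit for this level is set."""
    return (mask >> level) & 1 == 1

def can_start_match(stack, start, masks):
    """Cheap pre-check before searching the tree from stack[start].

    masks maps a production number to its level bit vector. Terminal
    symbols (negative items) are not checked in this sketch.
    """
    for level, item in enumerate(stack[start:]):
        if item > 0 and not may_match(masks.get(item, 0), level):
            return False       # this production cannot sit at this level
    return True
```

A False result means the tree need not be searched at all from that starting point; a True result only means that a search is worth attempting.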
Certain non-terminal nodes in the code generation tree
may match a class of productions defined by a VALUE kludge.
COTREE performs the following algorithm: It is called
once from PARSE with each item of parser output. The item
is added to the bottom of the code generation stack. If it
is a terminal symbol, no other action is taken. If it is a
production number, an attempt is made to match some
substring in the stack with some pattern in the code
generation tree. A match is attempted starting at each node
in the stack, starting at the top.
1. If a match is found which might be continued past
the end of the stack, no action is taken, and COTREE
returns control to the parser for more parser output.
In this case, more context is needed to determine
whether this is a special case.
2. If no match is found at any position, another symbol
is requested from the parser. This is in fact an
error condition, but is not checked for until
completion of the parse.
3. If a match is found, all the way through to a
terminal node of the code generation tree, the
appropriate semantics are applied.
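The three outcomes can be illustrated with a small Python sketch in which the code generation tree is a nested dictionary. This is a simplification under stated assumptions: items are matched by exact equality only (no object classes or VALUE kludges), terminal nodes are leaves marked by the hypothetical key SEM, and index 0 of the list is the top of the stack.

```python
SEM = "semantics"   # hypothetical key marking a terminal node of the tree

def match_from(tree, stack, start):
    """Return ('complete', semantics, length), ('partial',) or ('none',)."""
    node = tree
    for offset, item in enumerate(stack[start:]):
        if item not in node:
            return ("none",)
        node = node[item]
        if SEM in node:
            return ("complete", node[SEM], offset + 1)
    # Ran off the working end of the stack while still inside a pattern:
    # more parser output is needed to decide.
    return ("partial",)

def scan(tree, stack):
    """Try each starting position in turn, beginning at the top."""
    for start in range(len(stack)):
        result = match_from(tree, stack, start)
        if result[0] != "none":
            return (start,) + result
    return None   # no match anywhere: request another symbol
```

A result of None corresponds to case 2, a 'partial' result to case 1, and a 'complete' result to case 3, in which the named semantics would be applied to the matched stretch of the stack.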
The semantics are applied by moving the objects in the
matched section of the stack onto the semantics stack, and
calling the semantics module, DOSEM, with the index in the
semantics array of the semantics to be invoked. DOSEM will
usually supply a single object as the result of the
semantics. COTREE then inserts this object into the stack
in place of all the objects and productions matched, and
goes back to look for another pattern match.
Quoted semantics are stored in the semantics array
essentially as parser output, with special indicators for
local symbols and for places where the arguments are to be
inserted (copying the associated code, if any, for all but
the first insertion of each argument). DOSEM performs
quoted semantics by calling CODENT, in the code generation
stack package, to add each item in the quoted semantics onto
the bottom of the code generation stack. CODENT also sets a
flag for COTREE, so that when DOSEM returns control to
COTREE special action is taken to rearrange the stack so
that the entire result of the quoted semantics is inserted
in place of the matched portion of the stack. Thus, it is
quite possible to have semantics generate more items on the
stack than were necessary to invoke it. This process is
used in the semantics for IMP72 conditionals, where several
quoted semantic rewrites of a conditional expression may be
used to put it into a form which generates good object code.
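The expansion of quoted semantics amounts to instantiating a stored template. The Python sketch below shows the idea; the ("arg", i) and ("local", name) markers and the naming of fresh local symbols are inventions for the example, and the copying of associated code for repeated arguments is not modeled.

```python
import itertools

def expand(template, args, fresh=None):
    """Instantiate a stored template as a list of items for the stack.

    template -- plain items, ("arg", i) markers and ("local", name) markers
    args     -- the objects that were matched on the stack
    fresh    -- iterator of integers used to name new local symbols
    """
    if fresh is None:
        fresh = itertools.count(1)
    local_names = {}
    out = []
    for entry in template:
        kind = entry[0] if isinstance(entry, tuple) else None
        if kind == "arg":
            out.append(args[entry[1]])          # insert the i-th argument
        elif kind == "local":
            # Each local symbol is created once per expansion and reused.
            if entry[1] not in local_names:
                local_names[entry[1]] = "%s.%d" % (entry[1], next(fresh))
            out.append(local_names[entry[1]])
        else:
            out.append(entry)                   # ordinary parser output
    return out
```

As in the compiler, the expansion may well be longer than the list of matched objects that triggered it.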
There is also a special switch in COTREE which causes
it not to enter the parser output in the code generation
stack at all, but rather to store it in an array for later
interpretation by certain semantic routines. The semantic
routines MAXWELL and MAXEND turn this switch on and off
respectively; the production invoking the latter makes use
of the PRIORITY mechanism to avoid being stored and ignored
itself. This mechanism is used to implement quoted
semantics and semantic conditionals.
4.3: Semantics and Code Generation.
Semantic routines are invoked by DOSEM, which merely
interprets the entries in the semantics array. The routine
is completely explained by the data structure, which is
well-documented on the source file.
(The confusing routines in the semantics package, and
there are very confusing ones indeed, have to do with the
semantics for syntax, which is discussed below.)
4.4: How Extensibility is Implemented.
The syntax of IMP72 is stored in a syntax graph and two
associated connectivity matrices, as described in (Irons
1971), except that the graph is not back-optimized, in order
to be able to add new productions without great fuss. New
productions are added to the graph, and the matrices updated,
by subroutine GRAPH, which is called with one production.
The parser accesses the graph via other subroutines in the
syntax graph module.
Initially, the graph is empty. Subroutine RSYN has
built into it a syntax and semantics for very simple syntax
statements. The process of creating an IMP compiler begins
by calling RSYN (from the driver program IMP) to read a
similar syntax and semantics from file SYNTAX. Once this is
done, the compiler presents an asterisk, and is normally
instructed to continue compiling from file SYNTAX. (See
Section 3.4 for the exact procedure.) Now, however, the
compiler is bootstrapped, and uses the syntax and semantics
in its internal tables. The first statements it reads are
the syntax and semantics for more complex syntax statements.
After this comes the syntax for the computational portion of
the language. When this has been compiled, the compiler is
saved. Thus, except for lexical conventions and the syntax
built into RSYN (which is only used to bootstrap the
compiler), the entire syntax and semantics for IMP72 comes
from file SYNTAX and the semantic subroutines referred to on
it.
The other routines on files RSYN and RSYN2 implement
the semantics for syntax statements. They were at once the
first and the most complex semantic routines implemented,
and, to put it kindly, are rather obscure. The routines
SYTRM, SYNT, and SYNTS build up a production in array SN
which is then entered in the syntax graph by semantic
routine SYNTAX. At this point, a list of the names of the
arguments of the production is in array NA for use in
compiling the semantics.
Basically, semantics are compiled into array SEMS in
the semantics package via calls on SETSEM. Semantic routine
calls are compiled by semantic routines SEMP and SEMR, which
check NA to see if identifiers are arguments of the
production.
Quoted semantics are processed by semantic routine
QUOSEM, which converts parser output into semantics which it
enters in the semantics array via SETSEM.
When a semantic condition is encountered, semantic
routine QUOSEM enters the pattern in the code generation
tree with a call to ENTREE, which returns a list of the
formal arguments of the condition. QUOSEM then calls SEMFIX
(in the semantic package) to process the semantics, which
have already been entered in the semantics array, to conform
with the new list of formal arguments.
When a modifier for a VALUE kludge is encountered on
quoted semantics, semantic routine REPVAL performs the
necessary alteration on the parser output before QUOSEM gets
it.
Semantic package routine SEMFX1 is called to perform a
similar alteration on default case semantics in the event
that a VALUE kludge applies to them.
5: Distribution Procedure for IMP72.
Until further notice, persons interested in receiving
copies of the IMP72 compiler may follow this procedure:
The master copy of the compiler will be maintained by
the person whose name and address appear below. From time
to time, as improvements are made and bugs are unmade, a new
version of the compiler will be issued, containing all
updates and corrections known to that point, and compatible
with previous versions except in cases of extreme necessity.
A version will be released on two DECtapes, containing
the following:
1. Files IMP.LOW and IMP.SHR: the compiler.
2. A file IMP.RNO: the RUNOFF source for this
manual.
3. A file IMPLIB.LIB: the I/O library for IMP72
programs.
4. Source files for the compiler (with extension
IMC).
5. Source files for the I/O library (with extension
IML).
6. A file FORTIO.MAC, which contains utility
subroutines included in IMPLIB.LIB. FORTIO.HLP is
also included for the curious.
7. A file R.CMD which can be used to load the
compiler via .LOAD @R.
Anyone wishing to obtain a copy of the latest release
is invited to send two DECtapes to:
WALTER BILOFSKY
BOLT BERANEK AND NEWMAN, INC.
50 MOULTON STREET
CAMBRIDGE, MASS. 02138
Well-documented information concerning specific bugs in
the compiler, if couched in gentle and soothing terms, and
suggestions, briefly expressed, will also be received
cheerily at the above address. There is no guarantee,
however, that the fantastic improvements you have made in
YOUR IMP72 will be included in later releases of OUR IMP72.
Therefore, those interested in utilizing future releases
with minimal anguish factor are advised to confine
modifications wherever possible to
1. New syntax, on files to be compiled by IMP72, or,
failing that,
2. New semantic subroutines on source files separate
from the original compiler source files.
Notwithstanding any of the above or below, neither Yale
University, its Department of Computer Science, its graduate
students, faculty or employees, Bolt Beranek and Newman,
Inc., the author, their immediate families, friends or
neighbors, assert or assume any responsibility for the
accuracy of the material contained herein, for the continued
maintenance, development, or availability of the IMP72
compiler, or for any difficulties resulting from its
mis-use, -application, or -understanding, or for your two
DECtapes.
Caveat Usor! Good luck!
Appendix I. Utility Library Subroutines.
The IMP run-time library, file IMPLIB.LIB, contains a
package of subroutines called FORTIO, originally programmed
by Ken Shoemake at the Yale Department of Computer Science
to enable FORTRAN programmers to interface with the
DECsystem-10 monitor from a higher-level language. These
routines are useful to IMP programmers working under
DECsystem-10, and are therefore described here.
Many of the functions perform the DECsystem-10 UUO's of
the same name, and more information about them can be found
in the DECsystem-10 Assembly Language Handbook. In the
listing below, optional arguments are enclosed in angle
brackets. Unless otherwise stated, string arguments to
these functions are IMP ASCII strings.
SAVREG(array): saves all registers in the 16-long array
array.
RSTREG(array): restores all registers from the 16-long array
array.
ARGCNT(): returns the number of arguments the calling
program was called with; undefined for a main
program.
CALLI(fn,<v1,v2>): performs the CALLI UUO. Fn is either the
function name or the corresponding octal number.
V1, and v2 if appropriate, are the values for AC
and AC+1. CALLI returns the value returned in the
accumulator from the UUO. N.B.: Where fn is a
constant less than 4096 and v1 but not v2 is
present, this will compile as an in-line CALLI
(see Section 2.2.9).
NOSKIP(): returns -1 if the last call to the routine CALLI
resulted in a no-skip return, 0 otherwise.
SIXBIT(stg): returns the ASCII string stg (six characters or
less) converted to SIXBIT.
ASCII(sixb): returns the sixbit word sixb, converted to
ASCII. The second word of the result is returned
in register 1R.
RAD50(stg,<bits>): returns the value of string stg as radix
50, with bits in the four code bits.
RADX50(sixb,<bits>): returns the value of the sixbit word
sixb as radix 50, with bits in the four code bits.
ASCI50(r50,<bits>): returns the value of the radix 50 symbol
r50 converted to ASCII (second word in register
1R). If bits is present, it is set to the value
of the code bits.
SIX50(r50,<bits>): returns the value of the radix 50 symbol
r50 converted to sixbit. If bits is present, it
is set to the value of the code bits.
OPEN(ch,ary): performs the OPEN UUO on channel ch with the
address of the three-word array ary as the
effective address. Returns 0 if no error, -1
otherwise.
INIT(ch,stat,<dev,obuf,ibuf>): performs the INIT UUO with
the specified information. 'DSK' is the default
device, and if obuf or ibuf contains a
left-justified ASCII '0' the corresponding buffer
is omitted in the call.
INBUF(chan,n): performs the INBUF UUO with the specified
information.
OUTBUF(chan,n): performs the OUTBUF UUO with the specified
information.
RENAME(chan,ary)
RENAME(chan,name,ext,<pj,pg>): performs the RENAME UUO
using, in the first form, the argument block in
the array ary, and in the second form, the
arguments shown. Returns 0 if no error, otherwise
-1-(error nr.).
LOOKUP(chan,ary)
LOOKUP(chan,name,ext,<pj,pg>): performs the LOOKUP UUO;
similar to RENAME, q.v.
ENTER(chan,ary)
ENTER(chan,name,ext,<pj,pg>): performs the ENTER UUO;
similar to RENAME, q.v.
RELEA(chan): releases the specified channel.
CLOSE(chan,<n>): performs the CLOSE chan,n UUO.
IN(chan,<LOC(arry)>)
INPUT(chan,<arry>): performs the IN UUO. Returns -1 for an
error return, 0 otherwise.
OUT(chan,<LOC(arry)>)
OUTPUT(chan,<arry>): performs the OUT UUO. Returns -1 for
an error return, 0 otherwise.
GETSTS(chan): returns the result of a GETSTS for the
specified channel.
SETSTS(chan,status): performs the SETSTS UUO for the
specified channel and status word.
STATO(chan,bits): returns -1 if a STATO for the specified
channel gave a skip return, 0 otherwise.
STATZ(chan,bits): returns -1 if a STATZ for the specified
channel gave a skip return, 0 otherwise.
GETCHA(): returns the number of a channel which is not open,
starting with channel 17B. If no channels are
available, returns -1.
MTAPE(chan,n): performs the indicated MTAPE function.
UGETF(chan): returns the result of a UGETF UUO on the
specified channel.
USETI(chan,blknum): performs the USETI UUO for the specified
channel and block.
USETO(chan,blknum): performs the USETO UUO for the specified
channel and block.
INCHRW(): returns the value of a right-justified character
from the teletype, waiting until one is typed.
OUTCHAR(char): types the specified right-justified character
on the teletype.
INCHRS(): returns a character (right-justified) read from
the teletype, or -1 if none has been typed.
INCHWL(): returns a character (right-justified) read from
the teletype, waiting until a break character
(e.g., carriage return or altmode) has been typed.
This is the preferred way to read from the
teletype as it allows the person typing to edit
his line using rubout and control-U.
INCHSL(): same as INCHWL, but returns -1 if a line has not
been typed.
RESCAN(bit): backs up the teletype scan past one break
character, in order to allow rereading of the last
line (possibly a command) typed. If bit is 1,
returns -1 if no command is in the buffer.
SKPINC(): returns -1 if a character has been typed, 0
otherwise.
SKPINL(): returns -1 if a line has been typed, 0 otherwise.
OUTSTR(string): types the specified ASCII string on the
teletype.
IONEOU(char): types a single 8-bit character on the
teletype.
GETLCH(<line>): returns the line characteristics for the
specified line; assumes the caller's line if none
is specified.
SETLCH(value): sets the line characteristics for the
caller's line.
CLRBFI(): clears the input buffer.
CLRBFO(): clears the output buffer.
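For readers unfamiliar with the SIXBIT representation used by the SIXBIT and ASCII routines above, the following Python sketch shows the encoding itself: each character's SIXBIT code is its ASCII code minus 40 octal, and six codes are packed left-justified into a 36-bit word. The function names are hypothetical, and folding lower case to upper case is an assumption of the sketch; nothing here is taken from the FORTIO source.

```python
def sixbit(text):
    """Pack up to six characters into a 36-bit word, left-justified."""
    if len(text) > 6:
        raise ValueError("at most six characters fit in one word")
    word = 0
    for ch in text.upper().ljust(6):
        code = ord(ch) - 0o40          # SIXBIT code = ASCII code - 40 octal
        if not 0 <= code < 0o100:
            raise ValueError("%r has no SIXBIT code" % ch)
        word = (word << 6) | code
    return word

def unsixbit(word):
    """Unpack a 36-bit SIXBIT word, dropping trailing blanks."""
    chars = [chr(((word >> shift) & 0o77) + 0o40)
             for shift in range(30, -1, -6)]
    return "".join(chars).rstrip()
```

For instance, the device name DSK packs to the octal word 446353000000.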
Appendix II: Syntax of IMP72.
<NAM> ::= NAM"
<SYM> ::= SYM"
<SPT> ::= <SPT,A> <SYM,B>
<SPT> ::= <SPT,A> '<' <NAM,B>,<NAM,C> >
<SMA> ::= <NAM,A> ( )
<SMA> ::= <SPN,A> ( <SPL,B> )
<SPN> ::= <NAM,A>
<SPL> ::= <SPI,A>
<SPL> ::= <SPL,A> , <SPI,B>
<SPI> ::= <NAM,A>
<CSM> ::= <SMA,A>
<SMP> ::= <CSM,A>
<ST> ::= <SPT,A>
<SAU> ::= =
<ST> ::= <SPT,A>':' ':'<SAU,C><SMP,B>
<STL> ::= <ST,A>
<STL> ::= <STL,A> ';' <ST,B>
<PG> ::= <STL,A> '%'
<SPI> ::= <SPI,A> + <SPI,B>
<SPT> ::= <SPT,A> '<' <NAM,B> >
<SPT> ::= '<'<NAM,A>>':' ':'='<'<NAM,B>>
<SSP> ::= VALUE <NAM,A> OF <NAM,B>
<SSP> ::= PRIORITY <NAM,A> <SMA,B>
<SSP> ::= PRIORITY <NAM,A> <SSP,B>
<ST> ::= <SPT,A>':' ':'<SAU,B><SSP,C>
<SMP> ::= <SMP,A> ELSE <CSM,C>
<CSM> ::= CASE (<NLIST,A>) OF <NAM,B>
<CSM> ::= CASE (<NLIST,A>) OF <CASENAM,B> (<SMP,C>)
<CASENAM> ::= <NAM,A>
<SQUOTE> ::= "
<QUOSEM> ::= <SQUOTE,A> <STL,B> "
<QUOSEM> ::= <NAM,A> / <QUOSEM,B>
<CSM> ::= <QUOSEM,A> => <SMA,B>
<CSM> ::= <QUODEF,A>
<CSM> ::= <QUOSEM,A> => <QUODEF,B>
<QUODEF> ::= <QUOSEM,A>
<QUODEF> ::= LOCAL <NLIST,A> IN <QUOSEM,B>
<VBL> ::= <NAM,A>
<ION> ::= <VBL,A>
<ATOM> ::= <ION,A>
<EXP> ::= <ATOM,A>
<ST> ::= <EXP,A>
<ION> ::= (<STL,A>)
<ST> ::= GO TO <EXP,A>
<ST> ::= <NAM,A> ':' <ST,B>
<ION> ::= <NAM,A> ( )
<ION> ::= <NAM,A> ( <ELIST,B> )
<ST> ::= <NLIST,A> IS <PLIST,B>
<ST> ::= <NLIST,A> ARE <PLIST,B>
<NLIST> ::= <NAM,A>
<NLIST> ::= <NLIST,A> , <NAM,B>
<PLIST> ::= <PROP,A>
<PLIST> ::= <PLIST,A> , <PROP,B>
<PROP> ::= <NAM,A>
<PROP> ::= <EXP,A> LONG
<PROP> ::= COMMON
<PROP> ::= REAL
<PROP> ::= INTEGER
<PROP> ::= REGISTER
<PROP> ::= RESERVED
<PROP> ::= SCRATCH
<PROP> ::= PROTECTED
<PROP> ::= AVAILABLE
<PROP> ::= RELEASED
<PROP> ::= LOCAL
<SYN> ::= LET
<SYN> ::= <SYN,A> <NAM,B>=<VBL,C>,
<ST> ::= <SYN,A> <NAM,B>=<VBL,C>
<ATOM> ::= - <ATOM,A>
<ATOM> ::= NOT <ATOM,A>
<EXP> ::= <ATOM,I> LS <EXP,J>
<EXP> ::= <ATOM,A> ALS <EXP,B>
<EXP> ::= <ATOM,A> LROT <EXP,B>
<EXP> ::= <ATOM,A> RS <EXP,B>
<EXP> ::= <ATOM,A> ARS <EXP,B>
<EXP> ::= <ATOM,A> RROT <EXP,B>
<EXP> ::= <ATOM,Y> * <EXP,Z>
<EXP> ::= <ATOM,A> + <EXP,B>
<EXP> ::= <ATOM,X> - <EXP,Y>
<EXP> ::= <ATOM,A> / <EXP,B>
<EXP> ::= <A>//<B>
<EXP> ::= <ATOM,X> AND <EXP,Y>
<EXP> ::= <ATOM,A> OR <EXP,B>
<EXP> ::= <ATOM,A> XOR <EXP,B>
<EXP> ::= <ATOM,A> EQV <EXP,B>
<EXP> ::= <VBL,A> ← <EXP,B>
<EXP> ::= <VBL,A> '<'= <EXP,B>
<VBL> ::= <VBL,A> [<EXP,B>]
<VBL> ::= [<EXP,A>]
<ST> ::= SUBR <SUBP,A> IS <EXP,B>
<SUBP> ::= <NAM,A> ( <NLIST,B> )
<SUBP> ::= <NAM,A> ( )
<ST> ::= RETURN <A>
<ST> ::= GO TO ( <GOLST,A> ) <B>
<GOLST> ::= <NAM,A>
<GOLST> ::= <GOLST,A> , <NAM,B>
<ELIST> ::= <EXP,A>
<ELIST> ::= <ELIST,A>,<B>
<ST> ::= DATA ( <ELIST,A> )
<ST> ::= REMOTE <ST,A>
<ATOM> ::= LOC ( <VBL,A> )
<RELOP> ::= =
<RELOP> ::= '<'
<RELOP> ::= >
<RELOP> ::= NE
<RELOP> ::= LE
<RELOP> ::= GE
<RELOP> ::= EQ
<RELOP> ::= LT
<RELOP> ::= GT
<ST> ::= <EXP,A> => <ST,B>
<ST> ::= <EXP,A> <RELOP,EQ> <EXP,B>
<ST> ::= <EXP,A> <RELOP,EQ> <EXP,B> => <ST,C>
<ST> ::= <EXP,A> <RELOP,EQ> <EXP,B> => <ST,C> ELSE <ST,D>
<ST> ::= <EXP,A> <RELOP,EQ> <EXP,C>
<ST> ::= MOVE <B> THROUGH <N> TO <A>
<ST> ::= <EXP,A> FOR <VBL,B> IN <EXP,C>,<EXP,D>,<EXP,E>
<ST> ::= <EXP,A> FOR <VBL,B> TO <EXP,C>
<ST> ::= <EXP,A> FOR <VBL,B> FROM <EXP,C>
<ST> ::= WHILE <EXP,A> DO <ST,B>
<ST> ::= WHILE <EXP,A><RELOP,EQ><EXP,B> DO <ST,C>
<ST> ::= <EXP,A> UNTIL <B><RELOP,EQ><C>
<ST> ::= <EXP,A> UNTIL <EXP,B>
<COND> ::= (<A> <RELOP,EQ> <B>)
<COND> ::= (<A> <RELOP,EQ> <B>) OR <COND,C>
<COND> ::= (<A> <RELOP,EQ> <B>) AND <COND,C>
<ST> ::= <COND,A> => <ST,B>
<ST> ::= <COND,A> => <ST,B> ELSE <ST,C>
<ST> ::= WHILE <COND,A> DO <ST,B>
<ST> ::= <EXP,A> UNTIL <COND,B>
<IO> ::= PRINT <PITEM,A>
<IO> ::= READ <PITEM,A>
<IO> ::= <IO,A> , <PITEM,B>
<EXP> ::= <IO,A>
<PITEM> ::= <EXP,A>
<PITEM> ::= OCT <A>
<PITEM> ::= IGR <A>
<PITEM> ::= STG <A>
<PITEM> ::= FILE <NAM,A>
<PITEM> ::= FILE <NAM,A>.<B>
<PITEM> ::= FILE <NAM,A>[<C>,<D>]
<PITEM> ::= FILE <NAM,A>.<B>[<C>,<D>]
<PITEM> ::= /
<PITEM> ::= DEVICE <A>
<PITEM> ::= IMAGE MODE
<PITEM> ::= TAB <N>
<PITEM> ::= FILL <N>
<PITEM> ::= FLT <A>.<B>
<VBL> ::= <NAM,A>.<NAM,B>
<VBL> ::= <NAM,A>.<NAM,B>"<ATOM,C>
<ATOM> ::= FIX(<A>)
<ATOM> ::= FLT(<A>)
<BYTE> ::= <VBL,A> '<' <B> , <C> >
<ATOM> ::= <ION,A> '<' <B> , <C> >
<EXP> ::= <BYTE,A> ← <B>
<ATOM> ::= BYTEP <BYTE,A>
<BYTE> ::= '<' <EXP,A> >
<ATOM> ::= '<' <EXP,A> >
<ATOM> ::= '<' + <VBL,A> >
<EXP> ::= '<' + <VBL,A> > ← <B>
<ATOM> ::= <ION,A> '<' R >
<ATOM> ::= <ION,A> '<' L >
<EXP> ::= <VBL,A> '<' R > ← <B>
<EXP> ::= <VBL,A> '<' L > ← <B>
<ST> ::= EXECUTE <A>
<ST> ::= CALL ME <NAM,A>
<ST> ::= TWOSEG
<ATOM> ::= IDPB(<A>,<B>)
<ATOM> ::= ILDB(<A>)
<ATOM> ::= R<VBL,A>
<ION> ::= CALLI(<A>,<B>)
<ATOM> ::= XWD <A>,<B>
<ATOM> ::= IOWD <A>,<B>
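Read literally, the binary operator productions above all have the right-recursive shape <EXP> ::= <ATOM> op <EXP>, so that, as far as the grammar alone is concerned, an expression groups to the right with no precedence among operators. The Python sketch below parses a small subset to show that shape. It is a reading of the productions only: parentheses are simplified to enclose a single expression rather than a statement list, only a few operators are included, and Section 2.2.3 remains the authority on how the operators actually behave.

```python
import re

BINARY = {"+", "-", "*", "/", "AND", "OR", "XOR", "EQV"}

def tokenize(text):
    """Split text into names, operators and parentheses."""
    return re.findall(r"[A-Za-z][A-Za-z0-9]*|[-+*/()]", text)

def parse_exp(tokens, pos=0):
    """<EXP> ::= <ATOM> | <ATOM> op <EXP>   (right-recursive)."""
    left, pos = parse_atom(tokens, pos)
    if pos < len(tokens) and tokens[pos] in BINARY:
        op = tokens[pos]
        right, pos = parse_exp(tokens, pos + 1)
        return (op, left, right), pos
    return left, pos

def parse_atom(tokens, pos):
    """<ATOM> ::= name | - <ATOM> | NOT <ATOM> | ( <EXP> )."""
    tok = tokens[pos]
    if tok in ("-", "NOT"):
        operand, pos = parse_atom(tokens, pos + 1)
        return (tok, operand), pos
    if tok == "(":
        inner, pos = parse_exp(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected )")
        return inner, pos + 1
    return tok, pos + 1
```

Under this reading, A - B + C is parsed as A - (B + C), and parentheses are needed to obtain (A - B) + C.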
References
Bilofsky 1972: IMP Reference Manual, Working Paper No.
349, Institute for Defense Analyses, Princeton, N.J.
Bilofsky 1973: Syntax Extension and the IMP Programming
Language, unpublished.
Emerson 1849: Representative Men.
Irons 1963: An Error-Correcting Parse Algorithm, Comm. of
the ACM, Vol. 6 No. 12, Dec. 1963.
Irons 1970: Experience with an Extensible Language, Comm.
of the ACM, Vol. 13 No. 1, Jan. 1970.
Irons 1971: Syntax Graphs and Fast Context Free Parsing,
Research Report No. 71-1, Department of Computer
Science, Yale University.
Naur 1963: Revised Report on the Algorithmic Language ALGOL
60. Comm. of the ACM, Vol. 6 No. 1, Jan. 1963.
Section 1.1 defines BNF.
Shakespeare 1612: The Tempest
Webster's 1967: Webster's 7th New Collegiate Dictionary,
G&C Merriam Co., Pub., 1967.
Weingart 1973: An Efficient and Systematic Method of Code
Generation, Ph.D. Thesis, Department of Computer
Science, Yale University, June 1973.
INDEX
ADDOP . . . . . . . . . . . . 38
ambiguity, syntactic . . . . . 52
AREG1 . . . . . . . . . . . . 38
ARGCNT . . . . . . . . . . . . 76
arguments of subroutines . . . 9
arithmetic operators . . . . . 14
ASCI50 . . . . . . . . . . . . 77
ASCII . . . . . . . . . . . . 76
ATOM . . . . . . . . . . . . . 31
AVAILABLE . . . . . . . . . . 26
binary operators . . . . . . . 14
block transfers . . . . . . . 17, 28
bugs, reporting of . . . . . . 74
BYTE . . . . . . . . . . . . . 31
byte pointers . . . . . . . . 19
BYTEP . . . . . . . . . . . . 19
bytes . . . . . . . . . . . . 19
CALL ME . . . . . . . . . . . 27
CALLI . . . . . . . . . . . . 27, 76
CASE . . . . . . . . . . . . . 49
CLOSE . . . . . . . . . . . . 77
CLRBFI . . . . . . . . . . . . 79
CLRBFO . . . . . . . . . . . . 79
COMMON declaration . . . . . . 25
compilation speed . . . . . . 5
compiling IMP72 programs . . . 55
conditional expressions . . . 15
conditional semantics . . . . 46
constants . . . . . . . . . . 11
control expressions . . . . . 15
DATA . . . . . . . . . . . . . 26
DDT symbol table activation . 63
declarations . . . . . . . . . 24
device specification (I/O) . . 23
DEWOP . . . . . . . . . . . . 40
duplicate productions . . . . 48
ELSE . . . . . . . . . . . . . 16
ENTER . . . . . . . . . . . . 77
error diagnostics . . . . . . 57
EXECUTE . . . . . . . . . . . 27
EXP . . . . . . . . . . . . . 31
expressions . . . . . . . . . 8, 11
extensibility, implementation of . 72
extension, syntactic . . . . . 29
FETCH . . . . . . . . . . . . 41
file specifications . . . . . 23
FINI . . . . . . . . . . . . . 21
flexadecimal . . . . . . . . . 11
floating constant . . . . . . 12
FOR loops . . . . . . . . . . 16
format specifications (I/O) . 22
formatted I/O . . . . . . . . 20
FORTRAN, compatibility with . 9, 18, 25
GETCHA . . . . . . . . . . . . 78
GETLCH . . . . . . . . . . . . 79
GETSTS . . . . . . . . . . . . 78
GO TO . . . . . . . . . . . . 17
HOOK . . . . . . . . . . . . . 41
IN . . . . . . . . . . . . . . 77
INBUF . . . . . . . . . . . . 77
INCHRS . . . . . . . . . . . . 78
INCHRW . . . . . . . . . . . . 78
INCHSL . . . . . . . . . . . . 78
INCHWL . . . . . . . . . . . . 78
INIT . . . . . . . . . . . . . 77
INPUT . . . . . . . . . . . . 77
input file, default . . . . . 21
input/output . . . . . . . . . 20
internal documentation . . . . 65
ION . . . . . . . . . . . . . 31
IONEOU . . . . . . . . . . . . 79
IOWD . . . . . . . . . . . . . 27
iteration . . . . . . . . . . 16
kludge . . . . . . . . . . . . 50
LET statement . . . . . . . . 24
listing . . . . . . . . . . . 56
loading object programs . . . 63
LOC . . . . . . . . . . . . . 13
LOCAL . . . . . . . . . . . . 30
LOCAL declaration . . . . . . 9, 25
local variables . . . . . . . 30
local versions of IMP, suggestions for making . 74
logical operators . . . . . . 15
LONG . . . . . . . . . . . . . 25
LOOKUP . . . . . . . . . . . . 77
machine language programming . 9
macros, syntactic . . . . . . 29
modifiers (in semantic conditions) . 46
MTAPE . . . . . . . . . . . . 78
NAME . . . . . . . . . . . . . 31
NOSKIP . . . . . . . . . . . . 76
NOT . . . . . . . . . . . . . 13
objects (semantic routine arguments) . 34
OPEN . . . . . . . . . . . . . 77
operators, arithmetic . . . . 14
operators, binary . . . . . . 14
operators, logical . . . . . . 15
operators, relational . . . . 15
operators, shift . . . . . . . 15
operators, unary . . . . . . . 13
order of evaluation . . . . . 11
OUT . . . . . . . . . . . . . 77
OUTBUF . . . . . . . . . . . . 77
OUTCHR . . . . . . . . . . . . 78
OUTPUT . . . . . . . . . . . . 77
output file, default . . . . . 21
output files, terminating . . 21
OUTSTR . . . . . . . . . . . . 79
parentheses . . . . . . . . . 13
parser . . . . . . . . . . . . 68
partial word access . . . . . 19
PDP-11 IMP . . . . . . . . . . 5
PRINT . . . . . . . . . . . . 20
priority semantics . . . . . . 51
PROTECTED . . . . . . . . . . 26
pure code . . . . . . . . . . 27
RAD50 . . . . . . . . . . . . 76
RADX50 . . . . . . . . . . . . 76
READ . . . . . . . . . . . . . 20
REAL . . . . . . . . . . . . . 25
recursive syntax definition . 32
reentrant code . . . . . . . . 27
REGISTER . . . . . . . . . . . 25
register names . . . . . . . . 9
REGOF . . . . . . . . . . . . 42
relational operators . . . . . 15
RELEA . . . . . . . . . . . . 77
RELEASED . . . . . . . . . . . 26
REMOTE . . . . . . . . . . . . 26
RENAME . . . . . . . . . . . . 77
RESCAN . . . . . . . . . . . . 78
RESERVED . . . . . . . . . . . 25
RETURN . . . . . . . . . . . . 19
RSTREG . . . . . . . . . . . . 76
running object programs . . . 63
SAVREG . . . . . . . . . . . . 76
scope of variables . . . . . . 9
SCRATCH . . . . . . . . . . . 26
semantic part . . . . . . . . 30
semantic routines . . . . . . 34
semantic routines, arguments of . 34, 36
semantic routines, table of . 38
semantics . . . . . . . . . . 29, 34
semantics, conditional . . . . 46
SETLCH . . . . . . . . . . . . 79
SETSTS . . . . . . . . . . . . 78
shift operators . . . . . . . 15
SIX50 . . . . . . . . . . . . 77
SIXBIT . . . . . . . . . . . . 76
size of compiler . . . . . . . 5
SKPINC . . . . . . . . . . . . 78
SKPINL . . . . . . . . . . . . 79
source format . . . . . . . . 8
special cases, implementation of . 69
special cases, order of recognizing . 53
ST . . . . . . . . . . . . . . 31
statements . . . . . . . . . . 8
STATO . . . . . . . . . . . . 78
STATZ . . . . . . . . . . . . 78
STL . . . . . . . . . . . . . 31
string constants . . . . . . . 12
SUBR . . . . . . . . . . . . . 18
subroutine definitions . . . . 18
subroutine linkage . . . . . . 18
switches, compiler . . . . . . 55
syntactic ambiguity . . . . . 52
syntactic classes . . . . . . 31
syntactic macros . . . . . . . 29
syntax definition, recursive . 32
syntax extension . . . . . . . 29
syntax of IMP . . . . . . . . 80
syntax part . . . . . . . . . 30
syntax statement . . . . . . . 30, 31
TAB . . . . . . . . . . . . . 24
tags . . . . . . . . . . . . . 17
terminating output files . . . 21
transfer of control . . . . . 17
TWOSEG . . . . . . . . . . . . 27
UGETF . . . . . . . . . . . . 78
unary operators . . . . . . . 13
UNTIL . . . . . . . . . . . . 17
USETI . . . . . . . . . . . . 78
USETO . . . . . . . . . . . . 78
utility subroutines . . . . . 76
VALUE . . . . . . . . . . . . 50
variables . . . . . . . . . . 11
VBL . . . . . . . . . . . . . 31
vector declarations . . . . . 25
version number of compiler . . 10
WHILE loops . . . . . . . . . 17
XWD . . . . . . . . . . . . . 27